吕慧伟
ASG 小组讨论班
http://asg.ict.ac.cn/lhw
2011.7
无锁编程简介
An Intro to Lock-free Programming
2
提纲
●无锁编程概述
‒ 动机:锁开销影响并行程序扩展性
‒ What/Why/How
●无锁编程实例
‒ 无锁队列
●无锁编程研究问题
‒ 事务内存
‒ 无锁数据结构
Art of Multiprocessor Programming 3
Multicore Scaling Process
User code
Multicore
Speedup 1.8x1.8x
7x7x
3.6x3.6x
Unfortunately, not so simple…
[Maurice 08]
Art of Multiprocessor Programming 4
Real-World Scaling Process
1.8x1.8x 2x2x
2.9x2.9x
User code
Multicore
Speedup
Parallelization and Synchronization
require great care…
[Maurice 08]
Art of Multiprocessor Programming 5
Asynchrony
• Sudden unpredictable delays
– Cache misses (short)
– Page faults (long)
– Scheduling quantum used up (really
long)
[Maurice 08]
6
What:什么是无锁编程?
●如果一个共享数据结构的操作不需要互
斥,那么它是无锁的。如果一个进程在操
作中间被中断,其他进程的操作不受影
响。 [Herlihy 1991]
●并行算法同步的分类
‒ 阻塞同步 (mutex,semaphore,...)
‒ 无阻塞同步 [LLF10]
– 无等待∈无锁∈无阻碍
– wait-free∈lock-free∈obstruction-free
7
Why:为什么要无锁?
●性能考虑
‒ 对一些应用性能更好
●避免锁的使用引起的错误和问题
‒ 死锁:两个以上进程互等结束
‒ Convoy :多个进程反复竞争同一个锁,抢占
锁失败后强制上下文切换。引起性能下降。
‒ 优先级反转:低优先级进程拥有锁时被抢占
8
How:如何无锁?
●方法:像事务一样操作[DRD08],知道谁拥
有数据
‒ 不同进程对私有数据更新互相隔离
‒ 提交操作是原子的:或者成功,或者丢弃
‒ 一致性:确保从一个状态变到另一状态
●工具:原子指令
●使用注意点
‒ 无锁算法要求从硬件并行的角度考虑算法
‒ 设计通用无锁算法很困难,一般只设计无锁的数
据结构
9
提纲
●无锁编程概述
●无锁编程实例
‒ 单生产者单消费者 FIFO:不需要原子指令
‒ 多生产者多消费者 FIFO:原子指令
‒ 无锁堆栈、 ABA问题
●无锁编程研究问题
10
Lamport's Lock-Free
Ring Buffer
readwrite
0N-1
Insert(T element)
1: wait until NEXT(write) != read
2: buffer[write] = element
3: write = NEXT(write)
Extract(T* element)
1: wait until read != write
2: *element = buffer[read]
3: read = NEXT(read)
NEXT(x) = (x + 1) % N
[Lamport, Comm. of ACM, 1977]
● Operate on control variables: read and write, which resp.
point to next read and write slots
slide taken from [Lee10]
11
CAS(cmpxchg)
slide taken from [CS380D]
12
多生产者多消费者
FIFO
void enQ(request, queue)
{
do{
local_head = queue->head;
request->next = queue->head;
val = cmpxchg(&queue->head,
local_head, request);
}while(val != local_head)
}
事务操作状态
Atomic & Consistency: cmpxchg
Isolation: 私有数据
13
多生产者多消费者
FIFO
void enQ(request, queue)
{
do{
local_head = queue->head;
request->next = queue->head;
val = cmpxchg(&queue->head,
local_head, request);
}while(val != local_head)
}
1
2
1 2
2
事务操作状态
如果在执行 cmpxchg时 , queue->head == local_head。即在
1, 2之间没有进程将 queue->head更改了。则 request赋给了
queue->head。否则,上述过程将重新进行。2 2
2
14
无锁堆栈
struct elem {
elem *link
any data;
}
elem *qhead;
Push (elem *x)
do
old = qhead;
x->link = old;
new = x;
cc = CAS(qhead, old, new);
until (cc == old;)
Pop ()
do
old = qhead;
new = old->link;
cc = CAS(qhead, old, new);
until (cc == old;)
return old;
15
ABA问题
A
B
C
thread 1:
pop()
thread 2:
pop()
pop()
push(A)
pop(); pop(); push(A); pop(); C
pop(); pop(); pop(); push(A); A
正确执行:
16
ABA问题
A
B
C
pop()
ret = A;
next = B;
// interrupted
CAS(A,A,B)
A
B
A =
17
ABA问题
A
B
C
pop()
ret = A;
next = B;
// interrupted
CAS(A,A,B)
pop()
ret = A;
next = B;
CAS(A,A,B)
A
B
A =
A
B
A =
18
ABA问题
B
C
// interrupted
CAS(A,A,B)
pop()
ret = B;
next = C;
CAS(B,B,C)
return B;
19
ABA问题
C
// interrupted
CAS(A,A,B)
push(A)
A->next = C;
CAS(C,C,A)
20
ABA问题
C
// resumes
CAS(A,A,B)A
A
B
A =
21
ABA问题
// resumes
CAS(A,A,B)Bx
22
slide taken from [CS380D]
version number
23
提纲
●无锁编程概述
●无锁编程实例
●无锁编程研究问题
‒ 无锁数据结构
‒ Transactional Memory
24
事务内存
Transactional Memory
为什么叫 Transactional Memory
• 从 Database的核心概念
Transaction借鉴
• Database一次 Transaction过
程
– 1. Begin the transaction
– 2. Execute several data manipulations
and queries
– 3. If no errors occur then commit the
transaction and end it
– 4. If errors occur then rollback the
transaction and end it
• ACI特征 ( Atomicity,
Consistency, Isolation )
Transaction State
Transactional Memory
•1977 Lomet
–(发现一种 )保证共享数据的抽象机制
•1993 Herlihy and Moss, ISCA
– Transactional Memory - architectural support
for lock-free data structures
–支持无锁数据结构的机制
•ISCA, HPCA, PPoPP, PODC...
–Challenge: to build an efficient TM
infrastructure
[TM M.K.]
Herlihy and Moss, ISCA 1993
[TM M.K.]
• coined the term Transactional Memory
•增加 6种新指令供程序员构建无锁数据结构
– load-transactional:读数据到私有寄存器
– load-transactional-exclusive:读数据到私有寄存
器,如果 cache-miss,请求数据所有权
– store-transactional:写到内存 (cache)但其他进
程在 commit之前不可见
– commit:提交内存更改
– abort:丢弃写更改
– validate:查询当前事务状态
Herlihy and Moss, ISCA 1993
[TM M.K.]
6 条新指令:
load-transactional, load-
transactional-exclusive, store-
transactional, commit, abort, and
validate.
处理器状态 flag
新增一个 T Cache缓存 TM操作
新的 3个 bus周期
Transactional Memory
• TM如何解决线程级并行?
– 程序员把计算wrap成 transaction
– 不是万能药,但是把同步操作从程序员转移到编译器、运行
环境和硬件
– 目前主要的信心来自 transaction解决了数据库的问题。尚
没有用 TM来解决一般的并行编程问题。
– 目前还只是语义级别的 TM定义 (尚没有编译器、运行环境
、库的支持 )
– TM会和非 TM的代码并存很长一段时间。 TM的成功取决于
旧代码和新代码的无缝结合。
[TM M.K.]
30
小结
●无锁编程可以避免使用锁的一些常见错
误,并有可能带来性能提高
●设计通用的无锁算法很难,目前常见的有
一些无锁的数据结构
●无锁的数据结构要求按照事务处理的方式
编程
●事务内存是对无锁的支持,目前研究很
热,难点在于有效的实现
31
参考资料
● [DRD08]Dr Dobbs, Writing Lock-Free Code: A
Corrected Queue, http://drdobbs.com/high-
performance-computing/210604448
● [LLF10] 杨小华,透过 Linux 内核看无锁编
程 ,http://www.ibm.com/developerworks/cn/lin
ux/l-cn-lockfree/
● [Maurice 08]Maurice Herlichy and Nir
Shavit, The Art of Multiprocessor
Programming
● [TM M.K.] James R. Larus and Ravi Rajwar,
Transactional Memory, published by Morgan
Kaufman
32
参考资料
● [Herlihy 1991]Maurice Herlihy, Wait-free
Synchronization,1991
● [NSYNC]Nonblocking synchronization
bibliography, http://www.cs.wisc.edu/trans-
memory/biblio/swnbs.html
● [Langdale05]Geoff Langdale, Lock-Free
Programming, www.cs.cmu.edu/~410-
s05/lectures/L31_LockFree.pdf
● [Lee10]Patrick P.C. Lee, A Lock-Free,
Cache-Efficient Multi-Core Synchronization
Mechanism for Line-Rate Network Traffic
Monitoring
33
参考资料
● [CS380D]Wait-Free Synchronization Lecture,
CS380D—Distributed Computing, The
University of Texas at Austin