Reference book: 《深入Linux设备驱动程序内核机制》
Spinlock
Spinlocks were originally designed to protect shared data on multiprocessor systems. The core idea is this: set up a global lock variable V shared among the processors, and define V=1 as the locked state and V=0 as the unlocked state. When code on processor A wants to enter the critical section, it first reads V and tests whether it is 0. If V≠0, code on some other processor is currently accessing the shared data, so processor A busy-waits, i.e. it spins. If V=0, no code on any other processor is inside the critical section, so processor A may access the resource: it first sets V to 1 (locking the spinlock), then enters the critical section, and when it finishes and leaves the critical section it sets V back to 0 (unlocking the spinlock).
The crucial point in turning this design into concrete code is that it must be **guaranteed that the sequence "read V, test V, update V" performed by processor A is an atomic operation**. An atomic operation, simply put, is one whose instruction sequence executes on the processor as if it were a single instruction; in other words, its execution is indivisible.
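To make the idea concrete before turning to the kernel sources, here is a minimal conceptual sketch of such a lock (not the kernel implementation), written with GCC's __sync atomic builtins; the names toy_spinlock_t, toy_spin_lock and toy_spin_unlock are made up for illustration:
/* Conceptual sketch only: a test-and-set spinlock built on GCC atomic builtins.
 * The point is that "read V, test V, update V" collapses into one atomic instruction. */
typedef struct { volatile unsigned int v; } toy_spinlock_t; /* V: 0 = unlocked, 1 = locked */
static void toy_spin_lock(toy_spinlock_t *l)
{
    /* atomically write 1 into V and return its old value;
     * keep spinning as long as the old value was already 1 */
    while (__sync_lock_test_and_set(&l->v, 1))
        ; /* busy-wait (spin) */
}
static void toy_spin_unlock(toy_spinlock_t *l)
{
    __sync_lock_release(&l->v); /* atomically set V back to 0 */
}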
spin_lock and spin_unlock
Below is the definition of the spin_lock interface function that the Linux sources provide to device drivers and other kernel modules:
// include/linux/spinlock.h
static inline void spin_lock(spinlock_t *lock)
{
raw_spin_lock(&lock->rlock);
}
The data structure spinlock_t in this code is the concrete form, in the real sources, of the spinlock shared among processors described above. Peeling back the layers of definitions shows that it is really just a volatile unsigned int variable:
typedef struct spinlock {
union {
struct raw_spinlock rlock;
};
} spinlock_t;
typedef struct raw_spinlock {
arch_spinlock_t raw_lock;
} raw_spinlock_t;
typedef struct {
volatile unsigned int lock;
} arch_spinlock_t;
raw_spin_lock, called by spin_lock, is a macro whose implementation is architecture-specific; for ARM processors it ultimately expands to:
#define raw_spin_lock(lock) _raw_spin_lock(lock)
#define _raw_spin_lock(lock) __raw_spin_lock(lock)
static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
preempt_disable();
spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
}
static inline void do_raw_spin_lock(raw_spinlock_t *lock) __acquires(lock)
{
__acquire(lock);
arch_spin_lock(&lock->raw_lock);
}
static inline void arch_spin_lock(arch_spinlock_t *lock)
{
unsigned long tmp;
__asm__ __volatile__(
"1: ldrex %0, [%1]\n"
" teq %0, #0\n"
WFE("ne")
" strexeq %0, %2, [%1]\n"
" teqeq %0, #0\n"
" bne 1b"
: "=&r" (tmp)
: "r" (&lock->lock), "r" (1)
: "cc");
smp_mb();
}
The assembly embedded in arch_spin_lock (reached via do_raw_spin_lock) is the core of the spinlock implementation on ARM. It achieves the required atomicity with ldrex and strex, the ARM instructions provided specifically for exclusive access:
- "ldrex %0, [%1]" is equivalent to "tmp = lock->lock": it reads the initial state of the spinlock V into the temporary variable tmp.
- "teq %0, #0" tests whether V is 0. If it is not 0, the spinlock is currently locked and the code executes the "bne 1b" instruction, entering the busy-wait: it keeps returning to label 1, re-reading the lock state and testing it against 0.
- While the value is not 0, WFE("ne") (Wait For Event) puts the processor into a low-power state until an event arrives (the unlock path issues SEV via dsb_sev()).
- "strexeq %0, %2, [%1]" says: if V=0 (the spinlock is unlocked), the critical section may be entered, so store the constant 1 into V and put the result of that store operation into tmp.
- "teqeq %0, #0" checks whether tmp, the result of the preceding store to V, is 0. If it is 0, the update of V succeeded, V is now 1, and the code may enter the critical section. If tmp≠0, the update did not succeed, and "bne 1b" sends the code back into the busy-wait.
The reason "teqeq %0, #0" must be executed at all is exactly how ldrex and strex achieve atomicity: strex reports failure (a non-zero result) if any other processor has touched the location since our ldrex, so only a successful store proves that the read-test-update sequence ran without interference.
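As a usage sketch (the names my_lock, g_list and g_list_add below are illustrative, not taken from the kernel sources), a driver protecting a shared list with this pair of calls would look roughly like this:
#include <linux/spinlock.h>
#include <linux/list.h>
static LIST_HEAD(g_list);            /* shared data */
static DEFINE_SPINLOCK(my_lock);     /* statically defined spinlock, initially unlocked */
void g_list_add(struct list_head *new)
{
    spin_lock(&my_lock);             /* V!=0: spin; V==0: set V to 1 and proceed */
    list_add_tail(new, &g_list);     /* critical section: touch the shared list */
    spin_unlock(&my_lock);           /* set V back to 0 so others may enter */
}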
The counterpart of spin_lock is spin_unlock, a function that should be called on leaving the critical section to release the spinlock acquired earlier. Its external interface is defined as follows:
static inline void spin_unlock(spinlock_t *lock)
{
raw_spin_unlock(&lock->rlock);
}
#define raw_spin_unlock(lock) _raw_spin_unlock(lock)
#define _raw_spin_unlock(lock) __raw_spin_unlock(lock)
static inline void __raw_spin_unlock(raw_spinlock_t *lock)
{
spin_release(&lock->dep_map, 1, _RET_IP_);
do_raw_spin_unlock(lock);
preempt_enable();
}
static inline void do_raw_spin_unlock(raw_spinlock_t *lock) __releases(lock)
{
arch_spin_unlock(&lock->raw_lock);
__release(lock);
}
static inline void arch_spin_unlock(arch_spinlock_t *lock)
{
smp_mb();
__asm__ __volatile__(
" str %1, [%0]\n"
:
: "r" (&lock->lock), "r" (0)
: "cc");
dsb_sev();
}
Variants of spin_lock
Consider the following scenario:
The current process A on some processor needs to operate on a global linked list g_list, so before doing so it calls spin_lock to enter the critical section. While A is inside the critical section, an external hardware interrupt arrives on the processor running A, and the system must suspend A to handle the interrupt. Suppose the interrupt handler also happens to operate on g_list; since g_list is shared global data, the handler likewise calls spin_lock to protect it before touching it. But when the spin_lock in the interrupt handler tries to acquire the lock, it cannot succeed, because process A, which the interrupt preempted on this very processor, already holds it; the handler spins forever on a lock that can only be released by the code it interrupted, and the system deadlocks.
This example of spin_lock breaking down under an external interrupt means spin_lock must be amended for such situations, which is why spin_lock_irq and spin_lock_irqsave exist. The spin_lock_irq interface is defined as follows:
static inline void spin_lock_irq(spinlock_t *lock)
{
raw_spin_lock_irq(&lock->rlock);
}
#define raw_spin_lock_irq(lock) _raw_spin_lock_irq(lock)
#define _raw_spin_lock_irq(lock) __raw_spin_lock_irq(lock)
static inline void __raw_spin_lock_irq(raw_spinlock_t *lock)
{
local_irq_disable();
preempt_disable();
spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
}
Compared with raw_spin_lock, the implementation behind raw_spin_lock_irq differs only in calling local_irq_disable() before calling preempt_disable().
Consequently, whenever a spinlock may also be used in interrupt context, spin_lock_irq should be used rather than spin_lock; the latter is appropriate only when it is certain that the lock will never be taken in interrupt context. The release counterpart of spin_lock_irq is spin_unlock_irq, whose interface is defined as:
static inline void spin_unlock_irq(spinlock_t *lock)
{
raw_spin_unlock_irq(&lock->rlock);
}
#define raw_spin_unlock_irq(lock) _raw_spin_unlock_irq(lock)
#define _raw_spin_unlock_irq(lock) __raw_spin_unlock_irq(lock)
static inline void __raw_spin_unlock_irq(raw_spinlock_t *lock)
{
spin_release(&lock->dep_map, 1, _RET_IP_);
do_raw_spin_unlock(lock);
local_irq_enable();
preempt_enable();
}
Similar to spin_lock_irq, there is also the spin_lock_irqsave macro. Its main difference from spin_lock_irq is that, before disabling interrupts, it saves the processor's current FLAGS register (the CPSR on ARM) into a variable; when the matching spin_unlock_irqrestore is called to release the lock, the value saved by spin_lock_irqsave is written back into that register.
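A typical usage sketch (my_lock and the surrounding function are illustrative): flags receives the saved FLAGS/CPSR value and must be handed back on release:
static DEFINE_SPINLOCK(my_lock);
void touch_shared_data(void)
{
    unsigned long flags;
    spin_lock_irqsave(&my_lock, flags);      /* save CPSR/FLAGS, disable local IRQs, take the lock */
    /* ... access data that is also touched from an interrupt handler ... */
    spin_unlock_irqrestore(&my_lock, flags); /* drop the lock, restore the saved interrupt state */
}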
Another interrupt-related spinlock variant is spin_lock_bh, which handles mutual exclusion between process context and the concurrency introduced by deferred (bottom-half) processing. Compared with spin_lock_irq, spin_lock_bh disables only softirqs rather than hardware interrupts.
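A sketch of the spin_lock_bh pattern (again with illustrative names), for data shared between process context and a tasklet or other softirq:
static DEFINE_SPINLOCK(my_bh_lock);
void process_context_update(void)
{
    spin_lock_bh(&my_bh_lock);   /* disable softirqs on this CPU, then take the lock */
    /* ... data also used by a tasklet/softirq handler ... */
    spin_unlock_bh(&my_bh_lock); /* release the lock, re-enable softirqs */
}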
Finally, the spinlock family also provides a corresponding set of non-blocking ("try") versions, listed below, with a short usage sketch after the list:
static inline int spin_trylock(spinlock_t *lock);
static inline int spin_trylock_irq(spinlock_t *lock);
spin_trylock_irqsave(lock, flags);   /* a macro: also saves the IRQ flags into 'flags' */
static inline int spin_trylock_bh(spinlock_t *lock);
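A sketch of the non-blocking pattern, reusing the illustrative my_lock and g_list from earlier: spin_trylock returns non-zero if the lock was acquired and 0 if it was busy, so the caller can back off instead of spinning:
int g_list_try_add(struct list_head *new)
{
    if (!spin_trylock(&my_lock))
        return -EBUSY;               /* lock busy: let the caller retry later instead of spinning */
    list_add_tail(new, &g_list);     /* critical section entered without waiting */
    spin_unlock(&my_lock);
    return 0;
}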
Read-write lock (rwlock)
Compared with spin_lock, the interesting property of this lock is that it allows any number of readers into the critical section simultaneously, while writers must have exclusive access. A process that wants to read must check whether some process is currently writing: if so it must spin, otherwise it gets the lock. A process that wants to write must first check whether any process is currently reading or writing: if so it must spin.
Compared with a spinlock, an rwlock is no different in how the lock is defined or in its irq and preempt handling; the only difference is that rwlock provides separate lock operations for readers and for writers. These core lock/unlock operations are architecture-specific; the ARM implementation is examined below, after a short usage sketch.
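The usage sketch (my_rwlock and the two functions are illustrative): any number of readers may hold the lock at the same time, while a writer excludes everyone else:
#include <linux/spinlock.h>
static DEFINE_RWLOCK(my_rwlock);
void reader(void)
{
    read_lock(&my_rwlock);    /* many readers may be here concurrently */
    /* ... read-only access to the shared data ... */
    read_unlock(&my_rwlock);
}
void writer(void)
{
    write_lock(&my_rwlock);   /* exclusive: waits until no reader or writer holds the lock */
    /* ... modify the shared data ... */
    write_unlock(&my_rwlock);
}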
First, the writer's lock operation:
#define write_lock(lock) _raw_write_lock(lock)
#define _raw_write_lock(lock) __raw_write_lock(lock)
static inline void __raw_write_lock(rwlock_t *lock)
{
preempt_disable();
rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
}
# define do_raw_write_lock(rwlock) do {__acquire(lock); arch_write_lock(&(rwlock)->raw_lock); } while (0)
static inline void arch_write_lock(arch_rwlock_t *rw)
{
unsigned long tmp;
__asm__ __volatile__(
"1: ldrex %0, [%1]\n"
" teq %0, #0\n"
WFE("ne")
" strexeq %0, %2, [%1]\n"
" teq %0, #0\n"
" bne 1b"
: "=&r" (tmp)
: "r" (&rw->lock), "r" (0x80000000)
: "cc");
smp_mb();
}
The code first reads the value of lock and then tests whether it is 0. If it is 0 (nobody is using the lock; note that for a writer to acquire the lock successfully, no process may currently hold it for reading or for writing, because acquiring the lock for either purpose changes its value to something non-zero), lock is updated with 0x80000000. If lock is not 0, the lock is already in use by some other process (a reader or a writer), and this process spins (the "bne 1b" instruction jumps back to label 1).
The writer's unlock operation:
// the preceding call chain is omitted...
static inline void arch_write_unlock(arch_rwlock_t *rw)
{
smp_mb();
__asm__ __volatile__(
"str %1, [%0]\n"
:
: "r" (&rw->lock), "r" (0)
: "cc");
dsb_sev();
}
The code is simple: it sets lock back to 0 (the smp_mb() beforehand orders the critical section before the release, and dsb_sev() afterwards wakes any processors waiting in WFE).
Now the reader's lock operation:
// the preceding call chain is omitted...
static inline void arch_read_lock(arch_rwlock_t *rw)
{
unsigned long tmp, tmp2;
__asm__ __volatile__(
"1: ldrex %0, [%2]\n" // 独占方式加载当前锁的值到 tmp (%0)
" adds %0, %0, #1\n" // tmp 加 1,并更新条件标志位
" strexpl %1, %0, [%2]\n" // 如果结果为正数(pl),尝试写回新值
WFE("mi") // 如果结果为负数(mi),等待事件
" rsbpls %0, %1, #0\n" // 如果之前是正数,计算 0 减去 strex 的结果
" bmi 1b" // 如果rsbpls的结果是负数,跳回重试
: "=&r" (tmp), "=&r" (tmp2)
: "r" (&rw->lock)
: "cc");
smp_mb();
}
The reader's unlock operation:
static inline void arch_read_unlock(arch_rwlock_t *rw)
{
unsigned long tmp, tmp2;
smp_mb();
__asm__ __volatile__(
"1: ldrex %0, [%2]\n"
" sub %0, %0, #1\n"
" strex %1, %0, [%2]\n"
" teq %1, #0\n"
" bne 1b"
: "=&r" (tmp), "=&r" (tmp2)
: "r" (&rw->lock)
: "cc");
if (tmp == 0)
dsb_sev();
}
The reader's unlock operation mainly decrements lock by 1, since the reader incremented it by 1 when locking. Because several readers may be inside the critical section at once, care must be taken that concurrent decrements of lock do not interfere with one another; that is again the job of the ldrex/strex retry loop ("teq %1, #0" followed by "bne 1b"). Note also that dsb_sev() is issued only when the count has dropped to 0, i.e. by the last reader to leave.
Just as spinlock comes in multiple versions, rwlock has the corresponding set of variants (read_lock_irq, read_lock_irqsave, read_lock_bh, write_lock_irq, and so on), which will not be repeated here.
Semaphore
Compared with a spinlock, the defining feature of a semaphore is that it allows the calling thread to go to sleep. This means that a process trying to obtain a semaphore may lose ownership of the processor, i.e. a process switch can occur.
A semaphore is defined as follows:
// include/linux/semaphore.h
struct semaphore {
spinlock_t lock;
unsigned int count;
struct list_head wait_list;
};
Here, lock is a spinlock variable used to make operations on the other member, count, atomic. The unsigned integer count is the number of execution paths the semaphore allows into the critical section. wait_list manages all processes sleeping on this semaphore; a process that cannot obtain the semaphore is put to sleep. If a driver defines a variable of type struct semaphore, note that its members must not be assigned directly; the semaphore should instead be initialized with the sema_init function, which is defined as follows:
static inline void sema_init(struct semaphore *sem, int val)
{
static struct lock_class_key __key;
*sem = (struct semaphore) __SEMAPHORE_INITIALIZER(*sem, val);
lockdep_init_map(&sem->lock.dep_map, "semaphore->lock", &__key, 0);
}
#define __SEMAPHORE_INITIALIZER(name, n) \
{ \
.lock = __SPIN_LOCK_UNLOCKED((name).lock), \
.count = n, \
.wait_list = LIST_HEAD_INIT((name).wait_list), \
}
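A short initialization sketch (my_sem and my_setup are illustrative names); the members are never written directly, only through sema_init:
#include <linux/semaphore.h>
static struct semaphore my_sem;
void my_setup(void)
{
    sema_init(&my_sem, 1);    /* count = 1: at most one execution path in the critical section */
}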
The principal operations on a semaphore are DOWN and UP. The DOWN operations available in the Linux kernel are:
void down(struct semaphore *sem);
int down_interruptible(struct semaphore *sem);
int down_killable(struct semaphore *sem);
int down_trylock(struct semaphore *sem);
int down_timeout(struct semaphore *sem, long jiffies);
Of these functions, the one drivers use most frequently is down_interruptible:
int down_interruptible(struct semaphore *sem)
{
unsigned long flags;
int result = 0;
spin_lock_irqsave(&sem->lock, flags);
if (likely(sem->count > 0))
sem->count--;
else
result = __down_interruptible(sem);
spin_unlock_irqrestore(&sem->lock, flags);
return result;
}
static noinline int __sched __down_interruptible(struct semaphore *sem)
{
return __down_common(sem, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
}
static inline int __sched __down_common(struct semaphore *sem, long state,
long timeout)
{
struct task_struct *task = current;
struct semaphore_waiter waiter;
/* queue this task at the tail of the semaphore's wait list */
list_add_tail(&waiter.list, &sem->wait_list);
waiter.task = task;
waiter.up = 0;
for (;;) {
if (signal_pending_state(state, task))
goto interrupted;
if (timeout <= 0)
goto timed_out;
/* sleep with sem->lock dropped; up() sets waiter.up and wakes the task */
__set_task_state(task, state);
spin_unlock_irq(&sem->lock);
timeout = schedule_timeout(timeout);
spin_lock_irq(&sem->lock);
if (waiter.up)
return 0;
}
timed_out:
list_del(&waiter.list);
return -ETIME;
interrupted:
list_del(&waiter.list);
return -EINTR;
}
In contrast with the many versions of the DOWN operation, Linux has only one UP function:
// kernel/semaphore.c
void up(struct semaphore *sem)
{
unsigned long flags;
spin_lock_irqsave(&sem->lock, flags);
if (likely(list_empty(&sem->wait_list)))
sem->count++;
else
__up(sem);
spin_unlock_irqrestore(&sem->lock, flags);
}
// kernel/semaphore.c
static noinline void __sched __up(struct semaphore *sem)
{
struct semaphore_waiter *waiter = list_first_entry(&sem->wait_list,
struct semaphore_waiter, list);
list_del(&waiter->list);
waiter->up = 1;
wake_up_process(waiter->task);
}
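Putting the pieces together, the usual driver pattern around a critical section is a sketch like the following (my_sem as initialized above; returning -ERESTARTSYS on an interrupted sleep is a common driver convention, not mandated by the kernel):
int my_op(void)
{
    if (down_interruptible(&my_sem))
        return -ERESTARTSYS;   /* sleep was interrupted by a signal, semaphore not held */
    /* ... critical section: at most 'count' holders can be here at once ... */
    up(&my_sem);               /* release; wakes the first sleeper on wait_list, if any */
    return 0;
}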
As with spinlocks, if accesses to the shared resource are classified by type, reader and writer semaphores can be built on top of the plain semaphore. The concept is exactly the same as for the reader/writer spinlock, so the implementation of reader/writer semaphores is not examined in detail here; only a brief usage sketch of the interface follows.
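A brief sketch of the reader/writer semaphore (struct rw_semaphore) interface, which mirrors the rwlock interface but may sleep (my_rwsem and the functions are illustrative):
#include <linux/rwsem.h>
static struct rw_semaphore my_rwsem;
void my_rwsem_setup(void) { init_rwsem(&my_rwsem); }
void reader_path(void)
{
    down_read(&my_rwsem);   /* any number of concurrent readers; may sleep if a writer holds it */
    /* ... read the shared data ... */
    up_read(&my_rwsem);
}
void writer_path(void)
{
    down_write(&my_rwsem);  /* exclusive access; may sleep */
    /* ... modify the shared data ... */
    up_write(&my_rwsem);
}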
Mutex
Using a semaphore with count=1 for mutual exclusion is not the classic Linux idiom. The kernel defines a new data structure, struct mutex, specifically for the count=1 case; it is generally called a mutex lock, or simply a mutex. At the same time, according to the different usage scenarios, the kernel optimizes and extends the semaphore DOWN and UP operations for struct mutex (fast acquisition in the uncontended case, using atomic instructions for performance), dedicated to this new data type.
The mutex concept itself originates from the semaphore; once the debug-related members are removed, struct mutex is not essentially different from struct semaphore:
struct mutex {
/* 1: unlocked, 0: locked, negative: locked, possible waiters */
atomic_t count;
spinlock_t wait_lock;
struct list_head wait_list;
#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_SMP)
struct thread_info *owner;
#endif
#ifdef CONFIG_DEBUG_MUTEXES
const char *name;
void *magic;
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
#endif
};
A static struct mutex variable can be defined and initialized at the same time with the kernel's DEFINE_MUTEX macro:
// include/linux/mutex.h
#define __MUTEX_INITIALIZER(lockname) \
{ .count = ATOMIC_INIT(1) \
, .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
, .wait_list = LIST_HEAD_INIT(lockname.wait_list) \
__DEBUG_MUTEX_INITIALIZER(lockname) \
__DEP_MAP_MUTEX_INITIALIZER(lockname) }
#define DEFINE_MUTEX(mutexname) \
struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
If a mutex variable must be initialized while the program is running, the mutex_init macro can be used instead:
# define mutex_init(mutex) \
do { \
static struct lock_class_key __key; \
\
__mutex_init((mutex), #mutex, &__key); \
} while (0)
void
__mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
{
atomic_set(&lock->count, 1);
spin_lock_init(&lock->wait_lock);
INIT_LIST_HEAD(&lock->wait_list);
mutex_clear_owner(lock);
debug_mutex_init(lock, name, key);
}
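A short sketch of both initialization styles (my_mutex, struct my_dev and my_dev_setup are illustrative names):
#include <linux/mutex.h>
static DEFINE_MUTEX(my_mutex);    /* defined and initialized at compile time */
struct my_dev {
    struct mutex dyn_lock;        /* a mutex embedded in a driver-private structure */
};
void my_dev_setup(struct my_dev *dev)
{
    mutex_init(&dev->dyn_lock);   /* initialized at run time */
}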
The DOWN operation on a mutex is the kernel's mutex_lock function, defined as follows:
void __sched mutex_lock(struct mutex *lock)
{
might_sleep();
/*
* The locking fastpath is the 1->0 transition from
* 'unlocked' into 'locked' state.
*/
__mutex_fastpath_lock(&lock->count, __mutex_lock_slowpath);
mutex_set_owner(lock);
}
static inline void
__mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
{
int __ex_flag, __res;
__asm__ (
"ldrex %0, [%2] \n\t"
"sub %0, %0, #1 \n\t"
"strex %1, %0, [%2] "
: "=&r" (__res), "=&r" (__ex_flag)
: "r" (&(count)->counter)
: "cc","memory" );
__res |= __ex_flag;
if (unlikely(__res != 0))
fail_fn(count);
}
The design of the function rests on the two paths __mutex_fastpath_lock and __mutex_lock_slowpath. __mutex_fastpath_lock quickly determines whether the mutex can be acquired right now; if the lock is obtained, the function returns directly, otherwise execution falls into __mutex_lock_slowpath. This design is based on the fact that code wanting a mutex succeeds immediately in the vast majority of cases; at the code level, this means mutex_lock enters __mutex_lock_slowpath only with low probability.
After __res |= __ex_flag has executed, the if statement tests whether __res is 0. Two situations make __res non-zero: first, count->counter was 0 (or negative) before the call, meaning the mutex is already held by another process, so after the decrement __res is negative (for example -1); second, the strex update did not succeed, meaning another process was performing the same operation on count->counter concurrently. Either situation prevents __mutex_fastpath_lock from returning directly; instead it enters fail_fn, i.e. it calls __mutex_lock_slowpath.
__mutex_lock_slowpath is implemented as follows:
static __used noinline void __sched
__mutex_lock_slowpath(atomic_t *lock_count)
{
struct mutex *lock = container_of(lock_count, struct mutex, count);
__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0, _RET_IP_);
}
static inline int __sched
__mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
unsigned long ip)
{
struct task_struct *task = current;
struct mutex_waiter waiter;
unsigned long flags;
preempt_disable();
mutex_acquire(&lock->dep_map, subclass, 0, ip);
// This part implements optimistic spinning: when the lock is held, the task does not go to sleep immediately but first spins for a while, trying to acquire the lock
#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
/*
* Optimistic spinning.
*
* We try to spin for acquisition when we find that there are no
* pending waiters and the lock owner is currently running on a
* (different) CPU.
*
* The rationale is that if the lock owner is running, it is likely to
* release the lock soon.
*
* Since this needs the lock owner, and this mutex implementation
* doesn't track the owner atomically in the lock field, we need to
* track it non-atomically.
*
* We can't do this for DEBUG_MUTEXES because that relies on wait_lock
* to serialize everything.
*/
for (;;) {
struct thread_info *owner;
/*
* If we own the BKL, then don't spin. The owner of
* the mutex might be waiting on us to release the BKL.
*/
if (unlikely(current->lock_depth >= 0))
break;
/*
* If there's an owner, wait for it to either
* release the lock or go to sleep.
*/
owner = ACCESS_ONCE(lock->owner);
if (owner && !mutex_spin_on_owner(lock, owner))
break;
if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
lock_acquired(&lock->dep_map, ip);
mutex_set_owner(lock);
preempt_enable();
return 0;
}
/*
* When there's no owner, we might have preempted between the
* owner acquiring the lock and setting the owner field. If
* we're an RT task that will live-lock because we won't let
* the owner complete.
*/
if (!owner && (need_resched() || rt_task(task)))
break;
/*
* The cpu_relax() call is a compiler barrier which forces
* everything in this loop to be re-loaded. We don't need
* memory barriers as we'll eventually observe the right
* values at the cost of a few extra spins.
*/
arch_mutex_cpu_relax();
}
#endif
spin_lock_mutex(&lock->wait_lock, flags);
debug_mutex_lock_common(lock, &waiter);
debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
/* add waiting tasks to the end of the waitqueue (FIFO): */
list_add_tail(&waiter.list, &lock->wait_list);
waiter.task = task;
if (atomic_xchg(&lock->count, -1) == 1)
goto done;
lock_contended(&lock->dep_map, ip);
for (;;) {
/*
* Lets try to take the lock again - this is needed even if
* we get here for the first time (shortly after failing to
* acquire the lock), to make sure that we get a wakeup once
* it's unlocked. Later on, if we sleep, this is the
* operation that gives us the lock. We xchg it to -1, so
* that when we release the lock, we properly wake up the
* other waiters:
*/
if (atomic_xchg(&lock->count, -1) == 1)
break;
/*
* got a signal? (This code gets eliminated in the
* TASK_UNINTERRUPTIBLE case.)
*/
if (unlikely(signal_pending_state(state, task))) {
mutex_remove_waiter(lock, &waiter,
task_thread_info(task));
mutex_release(&lock->dep_map, 1, ip);
spin_unlock_mutex(&lock->wait_lock, flags);
debug_mutex_free_waiter(&waiter);
preempt_enable();
return -EINTR;
}
__set_task_state(task, state);
/* didn't get the lock, go to sleep: */
spin_unlock_mutex(&lock->wait_lock, flags);
preempt_enable_no_resched();
schedule();
preempt_disable();
spin_lock_mutex(&lock->wait_lock, flags);
}
done:
lock_acquired(&lock->dep_map, ip);
/* got the lock - rejoice! */
mutex_remove_waiter(lock, &waiter, current_thread_info());
mutex_set_owner(lock);
/* set it to 0 if there are no waiters left: */
if (likely(list_empty(&lock->wait_list)))
atomic_set(&lock->count, 0);
spin_unlock_mutex(&lock->wait_lock, flags);
debug_mutex_free_waiter(&waiter);
preempt_enable();
return 0;
}
The UP operation on a mutex is mutex_unlock, defined as follows:
void __sched mutex_unlock(struct mutex *lock)
{
/*
* The unlocking fastpath is the 0->1 transition from 'locked'
* into 'unlocked' state:
*/
#ifndef CONFIG_DEBUG_MUTEXES
/*
* When debugging is enabled we must not clear the owner before time,
* the slow path will always be taken, and that clears the owner field
* after verifying that it was indeed current.
*/
mutex_clear_owner(lock);
#endif
__mutex_fastpath_unlock(&lock->count, __mutex_unlock_slowpath);
}
static inline void
__mutex_fastpath_unlock(atomic_t *count, void (*fail_fn)(atomic_t *))
{
int __ex_flag, __res, __orig;
__asm__ (
"ldrex %0, [%3] \n\t"
"add %1, %0, #1 \n\t"
"strex %2, %1, [%3] "
: "=&r" (__orig), "=&r" (__res), "=&r" (__ex_flag)
: "r" (&(count)->counter)
: "cc","memory" );
__orig |= __ex_flag;
if (unlikely(__orig != 0))
fail_fn(count);
}
When no other process is competing for the mutex, __mutex_fastpath_unlock has the simplest possible job: add 1 to count->counter and return. If other processes are competing for the mutex (the counter was negative, or the strex failed), the function enters __mutex_unlock_slowpath, whose main task, much like the semaphore's up function, is to wake a process sleeping on the mutex's wait_list:
static __used noinline void
__mutex_unlock_slowpath(atomic_t *lock_count)
{
__mutex_unlock_common_slowpath(lock_count, 1);
}
static inline void
__mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
{
struct mutex *lock = container_of(lock_count, struct mutex, count);
unsigned long flags;
spin_lock_mutex(&lock->wait_lock, flags);
mutex_release(&lock->dep_map, nested, _RET_IP_);
debug_mutex_unlock(lock);
/*
* some architectures leave the lock unlocked in the fastpath failure
* case, others need to leave it locked. In the later case we have to
* unlock it here
*/
if (__mutex_slowpath_needs_to_unlock())
atomic_set(&lock->count, 1);
if (!list_empty(&lock->wait_list)) {
/* get the first entry from the wait-list: */
struct mutex_waiter *waiter =
list_entry(lock->wait_list.next,
struct mutex_waiter, list);
debug_mutex_wake_waiter(lock, waiter);
wake_up_process(waiter->task);
}
spin_unlock_mutex(&lock->wait_lock, flags);
}
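Finally, the usual locking pattern around a critical section is sketched below (my_mutex as defined earlier; my_mutex_op is illustrative). Because mutex_lock may sleep, it must only be used in process context; mutex_lock_interruptible is the variant that, like down_interruptible, returns an error if a signal arrives while the task is sleeping:
int my_mutex_op(void)
{
    mutex_lock(&my_mutex);       /* fastpath if uncontended, otherwise the task may sleep */
    /* ... critical section ... */
    mutex_unlock(&my_mutex);     /* fastpath, or wake the first waiter on wait_list */

    if (mutex_lock_interruptible(&my_mutex))
        return -ERESTARTSYS;     /* interrupted by a signal, mutex not held */
    /* ... critical section ... */
    mutex_unlock(&my_mutex);
    return 0;
}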