From: Manfred Spraul The last change to improve the scalability moved the actual wake-up out of the section that is protected by spin_lock(sma->sem_perm.lock). This means that IN_WAKEUP can be in queue.status even when the spinlock is acquired by the current task. Thus the same loop that is performed when queue.status is read without the spinlock acquired must be performed when the spinlock is acquired. In addition, user space may assume that semtimedop() is a memory barrier(). Thus add a smp_mb() into the lockless return path - otherwise the code would return after acquiring a semaphore without a memory barrier. Thanks to kamezawa.hiroyu@jp.fujitsu.com for noticing lack of the memory barrier. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16255 [akpm@linux-foundation.org: clean up kerneldoc, checkpatch warning and whitespace] Signed-off-by: Manfred Spraul Reported-by: Luca Tettamanti Reported-by: Christoph Lameter Cc: Maciej Rutecki --- diff --git a/ipc/sem.c b/ipc/sem.c index 506c849..40a8f46 100644 --- a/ipc/sem.c +++ b/ipc/sem.c @@ -1256,6 +1256,33 @@ out: return un; } + +/** + * get_queue_result - Retrieve the result code from sem_queue + * @q: Pointer to queue structure + * + * Retrieve the return code from the pending queue. If IN_WAKEUP is found in + * q->status, then we must loop until the value is replaced with the final + * value: This may happen if a task is woken up by an unrelated event (e.g. + * signal) and in parallel the task is woken up by another task because it got + * the requested semaphores. + * + * The function can be called with or without holding the semaphore spinlock. + */ +static int get_queue_result(struct sem_queue *q) +{ + int error; + + error = q->status; + while (unlikely(error == IN_WAKEUP)) { + cpu_relax(); + error = q->status; + } + + return error; +} + + SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops, unsigned, nsops, const struct timespec __user *, timeout) { @@ -1409,15 +1436,18 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops, else schedule(); - error = queue.status; - while(unlikely(error == IN_WAKEUP)) { - cpu_relax(); - error = queue.status; - } + error = get_queue_result(&queue); if (error != -EINTR) { /* fast path: update_queue already obtained all requested - * resources */ + * resources. + * Perform a smp_mb(): User space could assume that semop() + * is a memory barrier: Without the mb(), the cpu could + * speculatively read in user space stale data that was + * overwritten by the previous owner of the semaphore. + */ + smp_mb(); + goto out_free; } @@ -1427,10 +1457,12 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops, goto out_free; } + error = get_queue_result(&queue); + /* * If queue.status != -EINTR we are woken up by another process */ - error = queue.status; + if (error != -EINTR) { goto out_unlock_free; }