linux-kernel - Re: [PATCH] ipc/sem: Fix race between to-be-woken task and waker

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <6a65bfcc-661a-5ca5-86d6-0b4614f6aeea@colorfullife.com>
Date:   Fri, 27 Sep 2019 06:59:03 +0200
From:   Manfred Spraul <manfred@...orfullife.com>
To:     Waiman Long <longman@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, Arnd Bergmann <arnd@...db.de>,
        "Gustavo A. R. Silva" <gustavo@...eddedor.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Mathieu Malaterre <malat@...ian.org>,
        Davidlohr Bueso <dave@...olabs.net>
Subject: Re: [PATCH] ipc/sem: Fix race between to-be-woken task and waker

Hi,
On 9/26/19 8:12 PM, Waiman Long wrote:
> On 9/26/19 5:34 AM, Peter Zijlstra wrote:
>> On Fri, Sep 20, 2019 at 11:54:02AM -0400, Waiman Long wrote:
>>> While looking at a customr bug report about potential missed wakeup in
>>> the system V semaphore code, I spot a potential problem.  The fact that
>>> semaphore waiter stays in TASK_RUNNING state while checking queue status
>>> may lead to missed wakeup if a spurious wakeup happens in the right
>>> moment as try_to_wake_up() will do nothing if the task state isn't right.
>>>
>>> To eliminate this possibility, the task state is now reset to
>>> TASK_INTERRUPTIBLE immediately after wakeup before checking the queue
>>> status. This should eliminate the race condition on the interaction
>>> between the queue status and the task state and fix the potential missed
>>> wakeup problem.
You are obviously right, there is a huge race condition.
>> Bah, this code always makes my head hurt.
>>
>> Yes, AFAICT the pattern it uses has been broken since 0a2b9d4c7967,
>> since that removed doing the actual wakeup from under the sem_lock(),
>> which is what it relies on.

Correct - I've overlooked that.

First, theory:

setting queue->status, reading queue->status, setting 
current->state=TASK_INTERRUPTIBLE are all under the correct spinlock.

(there is an opportunistic read of queue->status without locks, but it 
is retried when the lock got acquired)

setting current->state=RUNNING is outside of any lock.

So as far as current->state is concerned, the lock doesn't exist. And if 
the lock doesn't exist, we must follow the rules applicable for 
set_current_state().

I'll try to check the code this week.

And we should check the remaining wake-queue users, the logic is 
everywhere identical.

> After having a second look at the code again, I probably misread the
> code the first time around. In the sleeping path, there is a check of
> queue.status and setting of task state both under the sem lock in the
> sleeping path. So as long as setting of queue status is under lock, they
> should synchronize properly.
>
> It looks like queue status setting is under lock, but I can't use
> lockdep to confirm that as the locking can be done by either the array
> lock or in one of the spinlocks in the array. Are you aware of a way of
> doing that?

For testing? Have you considered just always using the global lock?

(untested):

--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -370,7 +370,7 @@ static inline int sem_lock(struct sem_array *sma, 
struct sembuf *sops,
         struct sem *sem;
         int idx;

-       if (nsops != 1) {
+       if (nsops != 1 || 1) {
                 /* Complex operation - acquire a full lock */
                 ipc_lock_object(&sma->sem_perm);


> Anyway, I do think we need to add some comment to clarify the situation
> to avoid future confusion.

Around line 190 is the comment that explains locking & memory ordering.

I have only documented the content of sem_undo and sem_array, but 
neither queue nor current->state :-(


--

     Manfred