lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <268f8125-1d60-a1fa-bfec-2c2763de3e55@redhat.com>
Date:   Mon, 30 Sep 2019 09:53:12 -0400
From:   Waiman Long <longman@...hat.com>
To:     Manfred Spraul <manfred@...orfullife.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        1vier1@....de
Subject: Re: [PATCH] ipc/sem: Fix race between to-be-woken task and waker

On 9/29/19 6:24 AM, Manfred Spraul wrote:
> Hi Waiman,
>
> I have now written the mail 3 times:
> Twice I thought that I found a race, but during further analysis, it
> always turns out that the spin_lock() is sufficient.
>
> First, to avoid any obvious things: Until the series with e.g.
> 27d7be1801a4824e, there was a race inside sem_lock().
>
> Thus it was possible that multiple threads were operating on the same
> semaphore array, with obviously arbitrary impact.
>
I was saying that there was a race. It was just that my initial analysis
of the code seems to indicate a race was possible.


> On 9/20/19 5:54 PM, Waiman Long wrote:
>
>>   +        /*
>> +         * A spurious wakeup at the right moment can cause race
>> +         * between the to-be-woken task and the waker leading to
>> +         * missed wakeup. Setting state back to TASK_INTERRUPTIBLE
>> +         * before checking queue.status will ensure that the race
>> +         * won't happen.
>> +         *
>> +         *    CPU0                CPU1
>> +         *
>> +         *  <spurious wakeup>        wake_up_sem_queue_prepare():
>> +         *  state = TASK_INTERRUPTIBLE    status = error
>> +         *                try_to_wake_up():
>> +         *  smp_mb()              smp_mb()
>> +         *  if (status == -EINTR)      if (!(p->state & state))
>> +         *    schedule()            goto out
>> +         */
>> +        set_current_state(TASK_INTERRUPTIBLE);
>> +
>
> So the the hypothesis is that we have a race due to the optimization
> within try_to_wake_up():
> If the status is already TASK_RUNNING, then the wakeup is a nop.
>
> Correct?
>
> The waker wants to use:
>
>     lock();
>     set_conditions();
>     unlock();
>
> as the wake_q is a shared list, completely asynchroneously this will
> happen:
>
>     smp_mb(); //// ***1
>     if (current->state = TASK_INTERRUPTIBLE) current->state=TASK_RUNNING;
>
> The only guarantee is that this will happen after lock(), it may
> happen before set_conditions().
>
> The task that goes to sleep uses:
>
>     lock();
>     check_conditions();
>     __set_current_state();
>     unlock(); //// ***2
>     schedule();
>
> You propose to change that to:
>
>     lock();
>     set_current_state();
>     check_conditions();
>     unlock();
>     schedule();
>
> I don't see a race anymore, and I don't see how the proposed change
> will help.
> e.g.: __set_current_state() and smp_mb() have paired memory barriers
> ***1 and ***2 above. 

Now that I had taken a second look at it. I agreed that the current code
should be fine. My only comment is that we should probably add extra
comments to clarify the situation so that it won't get messed up in the
future.

Cheers,
Longman

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ