Message-Id: <1593505946.t0nxq8q8kj.astroid@bobo.none>
Date:   Tue, 30 Jun 2020 19:08:10 +1000
From:   Nicholas Piggin <npiggin@...il.com>
To:     Oleg Nesterov <oleg@...hat.com>
Cc:     Andi Kleen <ak@...ux.intel.com>,
        Davidlohr Bueso <dave@...olabs.net>, Jan Kara <jack@...e.cz>,
        Lukas Czerner <lczerner@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Peter Zijlstra <peterz@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: wait_on_page_bit_common(TASK_KILLABLE, EXCLUSIVE) can miss
 wakeup?

Excerpts from Oleg Nesterov's message of June 30, 2020 4:17 pm:
> On 06/30, Nicholas Piggin wrote:
>> Excerpts from Oleg Nesterov's message of June 30, 2020 12:02 am:
>> > On 06/29, Nicholas Piggin wrote:
>> >>
>> >> prepare_to_wait_event() has a pretty good pattern (and comment), I would
>> >> favour using that (test the signal when inserting on the waitqueue).
>> >>
>> >> @@ -1133,6 +1133,15 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
>> >>         for (;;) {
>> >>                 spin_lock_irq(&q->lock);
>> >>
>> >> +               if (signal_pending_state(state, current)) {
>> >> +                       /* Must not lose an exclusive wake up, see
>> >> +                        * prepare_to_wait_event comment */
>> >> +                       list_del_init(&wait->entry);
>> >> +                       spin_unlock_irq(&q->lock);
>> >> +                       ret = -EINTR;
>> >
>> > Basically this is what my patch in the 1st email does. But note that we can't
>> > just set "ret = -EINTR" here, we will need to clear "ret" if test_and_set_bit()
>> > below succeeds. That is why I used another "int intr" variable.
>>
>> You snipped off one more important line of context. No such games are
>> required AFAIKS.
> 
> 		for (;;) {
> 			spin_lock_irq(&q->lock);
> 	 
> 	+               if (signal_pending_state(state, current)) {
> 	+                       /* Must not lose an exclusive wake up, see
> 	+                        * prepare_to_wait_event comment */
> 	+                       list_del_init(&wait->entry);
> 	+                       spin_unlock_irq(&q->lock);
> 	+                       ret = -EINTR;
> 	+                       break;
> 	+               }
> 
> 
> so wait_on_page_bit_common() just returns -EINTR if signal_pending_state() == T.
> And this is wrong if "current" was already woken up by unlock_page().
> 
> That is why ___wait_event() checks the condition even if prepare_to_wait_event()
> returns -EINTR. The comment in prepare_to_wait_event() tries to explain this.

Hmm, yeah, because we can loop around here with the task in a sleeping
state. Which comes back to Linus' fix. Thanks.
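
To make the race concrete, here is a minimal userspace sketch of the two
orderings. It is not kernel code: struct model, race_state(),
try_lock_page(), wait_signal_first(), wait_condition_first() and
wakeup_lost() are all hypothetical names made up for illustration, and the
whole wait loop is collapsed into a single pass. It only assumes what is
said above: unlock_page() has already cleared the bit and handed this task
its one exclusive wakeup, and a signal is pending by the time the task runs.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model {
	bool page_locked;     /* stands in for the page lock bit */
	bool signal_pending;  /* stands in for signal_pending_state() */
	bool on_queue;        /* waiter is still on the waitqueue */
	bool wakeup_consumed; /* unlock_page() already woke us exclusively */
};

/* The problematic state: woken by unlock_page(), then a signal arrives. */
struct model race_state(void)
{
	struct model m = {
		.page_locked = false,
		.signal_pending = true,
		.on_queue = true,
		.wakeup_consumed = true,
	};
	return m;
}

/* Models the test-and-set of the lock bit: true if we took the lock. */
bool try_lock_page(struct model *m)
{
	if (m->page_locked)
		return false;
	m->page_locked = true;
	return true;
}

/* Signal check first, as in the quoted hunk: bail out before the bit test. */
int wait_signal_first(struct model *m)
{
	if (m->signal_pending) {
		m->on_queue = false;
		return -EINTR;
	}
	return try_lock_page(m) ? 0 : -EAGAIN; /* -EAGAIN: would sleep */
}

/* Condition first, in the style of ___wait_event(): only report the
 * signal if the condition is still false. */
int wait_condition_first(struct model *m)
{
	if (try_lock_page(m)) {
		m->on_queue = false;
		return 0;
	}
	if (m->signal_pending) {
		m->on_queue = false;
		return -EINTR;
	}
	return -EAGAIN; /* would sleep */
}

/* True if our exclusive wakeup was consumed but we left the page unlocked
 * without taking it, so no other waiter gets woken for it. */
bool wakeup_lost(const struct model *m, int ret)
{
	return m->wakeup_consumed && ret != 0 && !m->page_locked;
}
```

Starting from race_state(), wait_signal_first() returns -EINTR with the
page left unlocked, so wakeup_lost() is true. wait_condition_first() takes
the lock and returns 0, which is the behaviour the comment in
prepare_to_wait_event() is asking for.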

It looks like I broke this with 62906027091f1, then Linus mostly fixed
it in a8b169afbf06a. My patch is what actually introduced this ugly
bit test, but do we even need it at all? If we do, then it's
under-commented, and I can't see how it wouldn't be racy. Can we just
get rid of it entirely?

Thanks,
Nick
