lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whvVWNXPJq1k566zn4SfXAifXtiA7T+7JFweR3rQ0nc9A@mail.gmail.com>
Date:   Sat, 27 Jun 2020 22:39:20 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Oleg Nesterov <oleg@...hat.com>, Nick Piggin <npiggin@...il.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Jan Kara <jack@...e.cz>, Davidlohr Bueso <dave@...olabs.net>,
        Andi Kleen <ak@...ux.intel.com>,
        Lukas Czerner <lczerner@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: wait_on_page_bit_common(TASK_KILLABLE, EXCLUSIVE) can miss wakeup?

On Fri, Jun 26, 2020 at 8:43 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> I ended up with something like the below.. but it is too warm to think
> properly.
>
> I don't particularly like WQ_FLAG_PAGEWAITERS, but I liked open-coding
> all that even less.

Ugh.

I think I have a much simpler approach, actually.

So the *problem* is purely around that

                if (behavior == EXCLUSIVE) {
                        if (!test_and_set_bit_lock(bit_nr, &page->flags))
                                break;
                } else ..

and in particular it is purely *after* that test_and_set_bit_lock()
case. We have two cases:

 (a) *If* we get the lock there, we're good, and all done, and we have
the lock. We don't care about any other wakeups, because they'll be
stale anyway (the thing that released the lock that we just took) and
because we got the lock, no other exclusive waiters should be woken up
anyway (and we will in turn wake up any waiters when we release it)

 (b) we did *not* get the lock, because somebody else got it and is
about to immediately unlock again. And that _future_ wakeup that they
send might get lost because it might end up targeting us (but we might
just exit and not care).

Agreed?

So the only case that really matters is that we didn't get the lock,
but we must *not* be woken up afterwards.

So how about the attached trivial two-liner? We solve the problem by
simply marking ourselves TASK_RUNNING, which means that we won't be
counted as an exclusive wakeup.

Ok, so the "one" line to do that is that is actually two lines:

        __set_current_state(TASK_RUNNING);
        smp_mb__before_atomic();

and there's four lines of comments to go with it, but it really is
very simple: if we do that before we do the test_and_set_bit_lock(),
no wakeups will be lost, because we won't be sleeping for that wakeup.

I'm not entirely happy about that "smp_mb__before_atomic()". I think
it's right in practice that test_and_set_bit_lock() (when it actually
does a write) has at LEAST atomic seqmantics, so I think it's good.
But it's not pretty.

But I don't want to use a heavy

        set_current_state(TASK_RUNNING);
        if (!test_and_set_bit_lock(bit_nr, &page->flags)) ..

sequence, because at least on x86, that test_and_set_bit_lock()
already has a memory barrier in it, so the extra memory barrier from
set_current_state() is all kinds of pointless.

Hmm?

                    Linus

Download attachment "patch" of type "application/octet-stream" (580 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ