lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wj9Hr4PBobc13ZEv3HvFfpiZYrWX2-t5F62TXmMJoL5ZA@mail.gmail.com>
Date: Sat, 4 Jan 2025 14:05:49 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Manfred Spraul <manfred@...orfullife.com>, Christian Brauner <brauner@...nel.org>, 
	David Howells <dhowells@...hat.com>, WangYuli <wangyuli@...ontech.com>, 
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: wakeup_pipe_readers/writers() && pipe_poll()

On Thu, 2 Jan 2025 at 08:33, Oleg Nesterov <oleg@...hat.com> wrote:
>
> I was going to send a one-liner patch which adds mb() into pipe_poll()
> but then I decided to make even more spam and ask some questions first.

poll functions are not *supposed* to need memory barriers.

They are supposed to do "poll_wait()" and then not need any more
serialization after that, because we either

 (a) have a NULL wait-address, in which case we're not going to sleep
and this is just a "check state"

 (b) the waiting function is supposed to do add_wait_queue() (usually
by way of __pollwait) and that should be a sufficient barrier to
anybody who does a wakeup

Note that add_wait_queue() ends up doing a spinlock sequence, and
while that is not a full memory barrier (well, it is on x86, but not
necessarily in general), it *should* be sufficient against an actual
waker.

That's kind of how add_wait_queue() vs wake_up() is supposed to work.

Of course, the fact that we're not discussing the pipe code *not*
doing a full wake sequence (but just a "is the wait queue empty"
thing) is what then messes with the generic rules.

And this makes me think that the whole comment above
waitqueue_active() is just fundamentally wrong. The smp_mb() is *not*
sufficient in the sequence

    smp_mb();
    if (waitqueue_active(wq_head))
        wake_up(wq_head);

because while it happens to work wrt prepare_to_wait() sequences, is
is *not* against other users of add_wait_queue().

In those other sequences the smp_mb() in set_current_state might have
happened long long before.

Those other users aren't just 'poll()', btw. There's quite a lot of
add_wait_queue() users in the kernel. It's a traditional model even if
it's not something people generally add to any more.

Now, hopefully many of those add_wait_queue() users end up using
set_current_state() and getting the memory barrier that way. Or they
use wait_woken() or any of the other proper helpers we have.

But I think this poll() thing is very much an example of this *not*
being valid, and I don't think it's in any way pipe-specific.

So maybe we really do need to add the memory barrier to
__add_wait_queue(). That's going to be painful, particularly with lots
of users not needing it because they have the barrier in all the other
places.

End result: maybe adding it just to __pollwait() is the thing to do,
in the hopes that non-poll users all use the proper sequences.

But no, this is most definitely not a pipe-only thing.

          Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ