lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250106163038.GE7233@redhat.com>
Date: Mon, 6 Jan 2025 17:30:39 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Manfred Spraul <manfred@...orfullife.com>,
	Christian Brauner <brauner@...nel.org>,
	David Howells <dhowells@...hat.com>,
	WangYuli <wangyuli@...ontech.com>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: wakeup_pipe_readers/writers() && pipe_poll()

On 01/04, Linus Torvalds wrote:
>
> On Thu, 2 Jan 2025 at 08:33, Oleg Nesterov <oleg@...hat.com> wrote:
> >
> > I was going to send a one-liner patch which adds mb() into pipe_poll()
> > but then I decided to make even more spam and ask some questions first.
>
> poll functions are not *supposed* to need memory barriers.
...
> But no, this is most definitely not a pipe-only thing.

Agreed, that is why I didn't send the patch which adds mb() into pipe_poll().

> They are supposed to do "poll_wait()" and then not need any more
> serialization after that, because we either
>
>  (a) have a NULL wait-address, in which case we're not going to sleep
> and this is just a "check state"

To be honest, I don't understand the wait_address check in poll_wait(),
it seems that wait_address is never NULL.

But ->_qproc can be NULL if do_poll/etc does another iteration after
poll_schedule_timeout(), in this case we can sleep again. But this
case is fine.

>  (b) the waiting function is supposed to do add_wait_queue() (usually
> by way of __pollwait) and that should be a sufficient barrier to
> anybody who does a wakeup

Yes.

> And this makes me think that the whole comment above
> waitqueue_active() is just fundamentally wrong. The smp_mb() is *not*
> sufficient in the sequence
>
>     smp_mb();
>     if (waitqueue_active(wq_head))
>         wake_up(wq_head);
>
> because while it happens to work wrt prepare_to_wait() sequences, is
> is *not* against other users of add_wait_queue().

Well, this comment doesn't look wrong to me, but perhaps it can be
more clear. It should probably explain that this pseudo code is only
correct because the waiter has a barrier before "if (@cond)" which
pairs with smp_mb() above waitqueue_active(). It even says

	prepare_to_wait(&wq_head, &wait, state);
	// smp_mb() from set_current_state()

but perhaps this is not 100% clear.

> But I think this poll() thing is very much an example of this *not*
> being valid, and I don't think it's in any way pipe-specific.

Yes.

> So maybe we really do need to add the memory barrier to
> __add_wait_queue(). That's going to be painful, particularly with lots
> of users not needing it because they have the barrier in all the other
> places.

Yes, that is why I don't really like the idea to add mb() into
__add_wait_queue().

> End result: maybe adding it just to __pollwait() is the thing to do,
> in the hopes that non-poll users all use the proper sequences.

That is what I tried to propose. Will you agree with this change?
We can even use smp_store_mb(), say

	@@ -224,11 +224,12 @@ static void __pollwait(struct file *filp, wait_queue_head_t *wait_address,
		if (!entry)
			return;
		entry->filp = get_file(filp);
	-	entry->wait_address = wait_address;
		entry->key = p->_key;
		init_waitqueue_func_entry(&entry->wait, pollwake);
		entry->wait.private = pwq;
		add_wait_queue(wait_address, &entry->wait);
	+	// COMMENT
	+	smp_store_mb(entry->wait_address, wait_address);

Oleg.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ