Message-ID: <addb53ac-2f46-45db-83ce-c6b28e40d831@colorfullife.com>
Date: Sat, 28 Dec 2024 17:45:15 +0100
From: Manfred Spraul <manfred@...orfullife.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
 WangYuli <wangyuli@...ontech.com>,
 linux-fsdevel <linux-fsdevel@...r.kernel.org>,
 Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
 Christian Brauner <brauner@...nel.org>
Subject: Re: [RESEND PATCH] fs/pipe: Introduce a check to skip sleeping
 processes during pipe read/write

Hi Oleg,

On 12/28/24 4:22 PM, Oleg Nesterov wrote:
> On 12/28, Oleg Nesterov wrote:
>>>   int __wake_up(struct wait_queue_head *wq_head, unsigned int mode,
>>>   	      int nr_exclusive, void *key)
>>>   {
>>> +	if (list_empty(&wq_head->head)) {
>>> +		struct list_head *pn;
>>> +
>>> +		/*
>>> +		 * pairs with spin_unlock_irqrestore(&wq_head->lock);
>>> +		 * We actually do not need to acquire wq_head->lock, we just
>>> +		 * need to be sure that there is no prepare_to_wait() that
>>> +		 * completed on any CPU before __wake_up was called.
>>> +		 * Thus instead of load_acquiring the spinlock and dropping
>>> +		 * it again, we load_acquire the next list entry and check
>>> +		 * that the list is not empty.
>>> +		 */
>>> +		pn = smp_load_acquire(&wq_head->head.next);
>>> +
>>> +		if(pn == &wq_head->head)
>>> +			return 0;
>>> +	}
>> Too subtle for me ;)
>>
>> I have some concerns, but I need to think a bit more to (try to) actually
>> understand this change.
> If nothing else, consider
>
> 	int CONDITION;
> 	wait_queue_head_t WQ;
>
> 	void wake(void)
> 	{
> 		CONDITION = 1;
> 		wake_up(WQ);
> 	}
>
> 	void wait(void)
> 	{
> 		DEFINE_WAIT_FUNC(entry, woken_wake_function);
>
> 		add_wait_queue(WQ, entry);
> 		if (!CONDITION)
> 			wait_woken(entry, ...);
> 		remove_wait_queue(WQ, entry);
> 	}
>
> this code is correct even if LOAD(CONDITION) can leak into the critical
> section in add_wait_queue(), so CPU running wait() can actually do
>
> 		// add_wait_queue
> 		spin_lock(WQ->lock);
> 		LOAD(CONDITION);	// false!
> 		list_add(entry, head);
> 		spin_unlock(WQ->lock);
>
> 		if (!false)		// result of the LOAD above
> 			wait_woken(entry, ...);
>
> Now suppose that another CPU executes wake() between LOAD(CONDITION)
> and list_add(entry, head). With your patch wait() will miss the event.
> The same for __pollwait(), I think...
>
> No?

Yes, you are right.
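
To spell out the interleaving with your names, as I read it (with the
patch applied, hypothetical trace):

	// CPU0: wait()
	spin_lock(WQ->lock);
	LOAD(CONDITION);		// false, leaked into the critical section

	// CPU1: wake()
	CONDITION = 1;
	// lockless check in __wake_up(): list_add() not visible yet
	if (smp_load_acquire(&WQ->head.next) == &WQ->head)
		return 0;		// taken, nobody woken

	// CPU0 continues
	list_add(entry, head);
	spin_unlock(WQ->lock);
	if (!false)			// result of the LOAD above
		wait_woken(entry, ...);	// missed wakeup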

In the worst case, CONDITION = 1 is only written to memory by the 
store_release() in spin_unlock().

This pairs with the load_acquire of spin_lock(), thus LOAD(CONDITION) 
is safe.

It could still work for prepare_to_wait() and thus for fs/pipe, since 
there the smp_mb() in set_current_state() prevents the condition load 
from being executed earlier.
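
For reference, a minimal sketch of the prepare_to_wait() pattern I mean
(not the actual fs/pipe code, reusing CONDITION/WQ from above):

	void wait(void)
	{
		DEFINE_WAIT(entry);

		for (;;) {
			/* list_add() under WQ->lock, then set_current_state(),
			 * which implies smp_mb() */
			prepare_to_wait(&WQ, &entry, TASK_INTERRUPTIBLE);
			if (CONDITION)	/* cannot be executed before the smp_mb() */
				break;
			schedule();
		}
		finish_wait(&WQ, &entry);
	}

Here the smp_mb() keeps LOAD(CONDITION) from moving before the
list_add(), which is exactly the reordering you describe above.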


--

     Manfred

