Message-ID: <20250225142632.GA29585@redhat.com>
Date: Tue, 25 Feb 2025 15:26:33 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "Sapkal, Swapnil" <swapnil.sapkal@....com>,
Manfred Spraul <manfred@...orfullife.com>,
Christian Brauner <brauner@...nel.org>,
David Howells <dhowells@...hat.com>,
WangYuli <wangyuli@...ontech.com>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org,
K Prateek Nayak <kprateek.nayak@....com>,
"Shenoy, Gautham Ranjal" <gautham.shenoy@....com>,
Neeraj.Upadhyay@....com
Subject: Re: [PATCH] pipe_read: don't wake up the writer if the pipe is still
full
On 02/24, Linus Torvalds wrote:
>
> However, I see at least one case where this exclusive wakeup seems broken:
>
>	/*
>	 * But because we didn't read anything, at this point we can
>	 * just return directly with -ERESTARTSYS if we're interrupted,
>	 * since we've done any required wakeups and there's no need
>	 * to mark anything accessed. And we've dropped the lock.
>	 */
>	if (wait_event_interruptible_exclusive(pipe->rd_wait,
>			pipe_readable(pipe)) < 0)
>		return -ERESTARTSYS;
>
> and I'm wondering if the issue is that the *readers* got stuck,
> Because that "return -ERESTARTSYS" path now basically will by-pass the
> logic to wake up the next exclusive waiter.
I think this is fine... let's denote this reader as R.
> Because that "return -ERESTARTSYS" is *after* the reader has been on
> the rd_wait queue - and possibly gotten the only wakeup that any of
> the readers will ever get - and now it returns without waking up any
> other reader.
I think this can't happen. ___wait_event() does
	init_wait_entry(&__wq_entry, exclusive ? WQ_FLAG_EXCLUSIVE : 0);	\
	for (;;) {								\
		long __int = prepare_to_wait_event(&wq_head, &__wq_entry, state);\
										\
		if (condition)							\
			break;							\
										\
		if (___wait_is_interruptible(state) && __int) {			\
			__ret = __int;						\
			goto __out;						\
		}								\
										\
		cmd;								\
	}									\
and in this case condition == pipe_readable(pipe), cmd == schedule().
Suppose that R got that only wakeup, and wake_up() races with some signal
so that signal_pending(R) is true.
In this case prepare_to_wait_event() will return -ERESTARTSYS, but
___wait_event() won't return this error code: it checks pipe_readable()
first, sees that it is true, and returns 0.
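The ordering argument above can be sketched as a small userspace model. This is only an illustration, not kernel code: struct model_state, the model_* functions and MODEL_WOULD_SCHEDULE are hypothetical names, and the two booleans stand in for signal_pending(R) and pipe_readable(pipe).

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ERESTARTSYS	512	/* numeric value of the kernel's ERESTARTSYS */
#define MODEL_WOULD_SCHEDULE	1	/* hypothetical marker: the loop would run cmd */

/* Hypothetical snapshot of what R observes on one loop iteration. */
struct model_state {
	bool signal_pending;	/* stands in for signal_pending(R) */
	bool readable;		/* stands in for pipe_readable(pipe) */
};

/* Models prepare_to_wait_event(): nonzero only when a signal is pending. */
static long model_prepare_to_wait_event(const struct model_state *s)
{
	return s->signal_pending ? -MODEL_ERESTARTSYS : 0;
}

/*
 * Models one pass of the ___wait_event() loop body. The condition is
 * tested before the interrupted check, so a true condition wins even
 * when prepare_to_wait_event() reported a pending signal.
 */
static long model_wait_event_once(const struct model_state *s)
{
	long intr = model_prepare_to_wait_event(s);

	if (s->readable)
		return 0;			/* "break" with __ret == 0 */
	if (intr)
		return intr;			/* "goto __out" with -ERESTARTSYS */
	return MODEL_WOULD_SCHEDULE;		/* would call schedule() and loop */
}

/* The race from this mail: R gets the wakeup and a signal at the same time. */
static void model_check_race(void)
{
	struct model_state s = { .signal_pending = true, .readable = true };

	assert(model_wait_event_once(&s) == 0);
}
```

In this model the error code is returned only when the condition is false, which is the only case where R leaves without having consumed a wakeup that mattered.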
After that R will restart the main loop with wake_next_reader = true,
and whatever it does, it should do wake_up(pipe->rd_wait) before it returns.
Note also that prepare_to_wait_event() removes the waiter from the
wait_queue_head->head list, so another wake_up() can't pick this task.
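To make the effect of that removal concrete, here is a toy queue model. It is a sketch under simplifying assumptions, not the kernel's list code: struct model_queue and the model_* helpers are hypothetical, waiters are plain integer ids, and an exclusive wake_up() is assumed to remove the entry it wakes, as autoremove_wake_function() does when the wakeup succeeds.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_WAITERS 4

/* Hypothetical stand-in for the rd_wait list: waiter ids in FIFO order. */
struct model_queue {
	int ids[MODEL_MAX_WAITERS];
	size_t len;
};

/* Models an exclusive waiter being queued at the tail. */
static void model_enqueue(struct model_queue *q, int id)
{
	assert(q->len < MODEL_MAX_WAITERS);
	q->ids[q->len++] = id;
}

/*
 * Models the signal-pending branch of prepare_to_wait_event():
 * the interrupted waiter's entry is deleted from the list.
 */
static void model_signal_dequeue(struct model_queue *q, int id)
{
	size_t i, j;

	for (i = 0; i < q->len; i++) {
		if (q->ids[i] != id)
			continue;
		for (j = i + 1; j < q->len; j++)
			q->ids[j - 1] = q->ids[j];
		q->len--;
		return;
	}
}

/*
 * Models an exclusive wake_up(): wakes (and removes) only the first
 * queued waiter. Returns its id, or -1 if nobody is queued.
 */
static int model_wake_up_exclusive(struct model_queue *q)
{
	size_t j;
	int id;

	if (q->len == 0)
		return -1;
	id = q->ids[0];
	for (j = 1; j < q->len; j++)
		q->ids[j - 1] = q->ids[j];
	q->len--;
	return id;
}
```

With R as id 1 and a second reader as id 2, calling model_signal_dequeue(&q, 1) and then model_wake_up_exclusive(&q) returns 2 in this model: the next wakeup goes to the other reader rather than being spent on R again.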
Can ___wait_event() miss the pipe_readable() event in this case? No,
both wake_up() and prepare_to_wait_event() take the same wq_head->lock.
What if pipe_readable() is actually false? Say, a spurious wakeup or, say,
pipe_write() does wake_up(rd_wait) when another reader has already made
the pipe_readable() condition false? This case looks "obviously fine" too.
So I am still confused.
I will wait for a reply from Sapkal, then I'll try to make a debugging patch.
Oleg.