[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20250311115332.GA3493@redhat.com>
Date: Tue, 11 Mar 2025 12:53:34 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Hillf Danton <hdanton@...a.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
K Prateek Nayak <kprateek.nayak@....com>,
Mateusz Guzik <mjguzik@...il.com>,
"Sapkal, Swapnil" <swapnil.sapkal@....com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] pipe_read: don't wake up the writer if the pipe is still
full
On 03/11, Hillf Danton wrote:
>
> On Mon, 10 Mar 2025 14:26:17 -1000 Linus Torvalds wrote:
> > On Mon, 10 Mar 2025 at 13:34, Hillf Danton <hdanton@...a.com> wrote:
> > >
> > > The step-03 in my scenario [1] shows a reader sleeps at line-370 after
> > > making the pipe empty, so after your change that cuts the chance for
> > > waking up writer, who will wake up the sleeping reader? Nobody.
> >
> > But step-03 will wake the writer.
> >
> > And no, nobody will wake readers, because the pipe is empty. Only the
> > next writer that adds data to the pipe should wake any readers.
> >
> > Note that the logic that sets "wake_writer" and "was_empty" is all
> > protected by the pipe semaphore. So there are no races wrt figuring
> > out "should we wake readers/writers".
> >
> > So I really think you need to very explicitly point to what you think
> > the problem is. Not point to some other email. Write out all out in
> > full and explain.
> >
> In the mainline tree, conditional wakeup [2] exists before a pipe writer
> takes a nap, so scenario can be constructed based on the one in commit
> 3d252160b818 to make pipe writer sleep with nobody woken up.
>
> step-00
> pipe->head = 36
> pipe->tail = 36
>
> step-01
> task-118762 is a writer
> pipe->head++;
> wakes up task-118740 and task-118768
>
> step-02
> task-118768 is a writer
> makes pipe full;
> sleeps without waking up any reader as
> pipe was not empty after step-01
>
> Conditional wakeup also exists on the reader side [3], but Oleg cut it off [4].
>
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -360,27 +360,7 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
> }
> mutex_unlock(&pipe->mutex);
>
> - /*
> - * We only get here if we didn't actually read anything.
> - *
> - * However, we could have seen (and removed) a zero-sized
> - * pipe buffer, and might have made space in the buffers
> - * that way.
> - *
> - * You can't make zero-sized pipe buffers by doing an empty
> - * write (not even in packet mode), but they can happen if
> - * the writer gets an EFAULT when trying to fill a buffer
> - * that already got allocated and inserted in the buffer
> - * array.
> - *
> - * So we still need to wake up any pending writers in the
> - * _very_ unlikely case that the pipe was full, but we got
> - * no data.
> - */
> - if (unlikely(wake_writer))
> - wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
> - kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
> -
> + BUG_ON(wake_writer);
> /*
> * But because we didn't read anything, at this point we can
> * just return directly with -ERESTARTSYS if we're interrupted,
>
>
> step-03
> task-118740 is reader
> makes pipe empty
> sleeps with no writer woken up
>
> After step-03, both reader(task-118740) and writer (task-118768) sleep
> waiting for each other, with Oleg's change.
Well. I have already tried to explain this at least twice :/ Prateek too.
After step-03 task-118740 won't sleep. pipe_read() won't sleep if it has
read even one byte. Since the pipe was full and this reader makes it empty,
"wake_writer" must be true after the main loop before return from pipe_read().
This means that the reader(task-118740) will wake the writer(task-118768)
before it returns from pipe_read().
Oleg.
> PS Oleg, given no seperate reply to you, check the above scenario instead please.
>
> [2] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/pipe.c#n576
> [3] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/pipe.c#n381
> [4] https://lore.kernel.org/lkml/20250309170254.GA15139@redhat.com/
>
Powered by blists - more mailing lists