[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250107172512.GB29771@redhat.com>
Date: Tue, 7 Jan 2025 18:25:12 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Manfred Spraul <manfred@...orfullife.com>,
Christian Brauner <brauner@...nel.org>,
David Howells <dhowells@...hat.com>,
WangYuli <wangyuli@...ontech.com>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: wakeup_pipe_readers/writers() && pipe_poll()
On 01/06, Linus Torvalds wrote:
>
> On Mon, 6 Jan 2025 at 11:34, Oleg Nesterov <oleg@...hat.com> wrote:
> >
> > 1. pipe_read() says
> >
> > * But when we do wake up writers, we do so using a sync wakeup
> > * (WF_SYNC), because we want them to get going and generate more
> > * data for us.
> >
> > OK, WF_SYNC makes sense if pipe_read() or pipe_write() is going to do wait_event()
> > after wake_up(). But wake_up_interruptible_sync_poll() looks at bit misleading if
> > we are going to wakeup the writer or next_reader before return.
>
> This heuristic has always been a bit iffy. And honestly, I think it's
> been driven by benchmarks that aren't necessarily always realistic (ie
> for ping-pong benchmarks, the best behavior is often to stay on the
> same CPU and just schedule between the reader/writer).
Agreed. But my question was not about performance, I just tried to
understand this logic. So in the case of
wake_up_interruptible_sync_poll(wr_wait);
wait_event_interruptible_exclusive(wr_read);
WF_SYNC is understandable, "stay on the same CPU" looks like the right
thing, and "_sync_" matches the comment above.
But if we are going to return, wake_up_interruptible_sync_poll() looks
a bit misleading to me.
> > 2. I can't understand this code in pipe_write()
> >
> > if (ret > 0 && sb_start_write_trylock(file_inode(filp)->i_sb)) {
> > int err = file_update_time(filp);
> > if (err)
> > ret = err;
> > sb_end_write(file_inode(filp)->i_sb);
> > }
> >
> > - it only makes sense in the "fifo" case, right? When
> > i_sb->s_magic != PIPEFS_MAGIC...
>
> I think we've done it for regular pipes too. You can see it with
> 'fstat()', after all.
Ah, indeed, thanks for correcting me...
And thanks for your other explanations. Again, it is not that I thought
this needs changes, just I was a bit confused. In particular by
err = file_update_time();
if (err)
ret = err;
which doesn't match the usage of file_accessed() in pipe_read().
Oleg.
Powered by blists - more mailing lists