Message-ID: <20250326121946.GC30181@redhat.com>
Date: Wed, 26 Mar 2025 13:19:47 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Dominique Martinet <asmadeus@...ewreck.org>
Cc: K Prateek Nayak <kprateek.nayak@....com>,
Eric Van Hensbergen <ericvh@...nel.org>,
Latchesar Ionkov <lucho@...kov.net>,
Christian Schoenebeck <linux_oss@...debyte.com>,
Mateusz Guzik <mjguzik@...il.com>,
syzbot <syzbot+62262fdc0e01d99573fc@...kaller.appspotmail.com>,
brauner@...nel.org, dhowells@...hat.com, jack@...e.cz,
jlayton@...nel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, netfs@...ts.linux.dev,
swapnil.sapkal@....com, syzkaller-bugs@...glegroups.com,
viro@...iv.linux.org.uk, v9fs@...ts.linux.dev
Subject: Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
On 03/25, Dominique Martinet wrote:
>
> Thanks for the traces.
>
> w/ revert
> K Prateek Nayak wrote on Tue, Mar 25, 2025 at 08:19:26PM +0530:
> > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll
> > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_request wr poll
> > kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7
>
> new behavior
> > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_poll rd poll
> > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_request wr poll
> > repro-4076 [031] ..... 99.731970: p9_client_rpc: Wait event killable (-512)
>
> For me the problem isn't so much that this gets ERESTARTSYS but that it
> never gets to read the 7 bytes that are available?
Yes...
OK, let's first recall what commit aaec5a95d59615523 ("pipe_read:
don't wake up the writer if the pipe is still full") does.
It simply removes the unnecessary/spurious wakeups when the writer
can't add more data to the pipe.
See the "stupid test-case" in
https://lore.kernel.org/all/20250120144338.GC7432@redhat.com/
In particular this note:

	As you can see, without this patch pipe_read() wakes the writer up
	4095 times for no reason, the writer burns a bit of CPU and blocks
	again after wakeup until the last read(fd[0], &c, 1).

In this test-case the writer sleeps in pipe_write(), but the same is true
for a task sleeping in poll( { .fd = pipe_fd, .events = POLLOUT}, ...).
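
To make the POLLOUT point concrete, here is a rough userspace sketch. It is
not the test-case from the link above, and probe_pollout(), pollout_ready()
and read_exactly() are names made up for this illustration. It assumes Linux
pipe semantics, where the pipe counts as full as long as every page-sized
slot is still in use: it fills a pipe with page-sized non-blocking writes,
then asks poll() whether the write end is writable after the fill, after
reading a single byte, and after draining one whole page. On Linux the first
two answers should be "not writable", so waking a POLLOUT sleeper after the
1-byte read can only make it re-check and go back to sleep.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical result record: POLLOUT readiness at three points in time. */
struct pollout_probe {
	int after_fill;		/* pipe just filled up */
	int after_one_byte;	/* one byte consumed, slot still in use */
	int after_one_page;	/* a whole page consumed, slot released */
};

/* Non-blocking check: does poll() report POLLOUT on wfd right now? */
static int pollout_ready(int wfd)
{
	struct pollfd pfd = { .fd = wfd, .events = POLLOUT };

	return poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLOUT);
}

/* Read exactly n bytes; the caller guarantees that many are queued. */
static int read_exactly(int rfd, char *buf, size_t n)
{
	while (n) {
		ssize_t r = read(rfd, buf, n);

		if (r <= 0)
			return -1;
		buf += r;
		n -= (size_t)r;
	}
	return 0;
}

/* Returns 0 and fills *res on success, -1 on any setup or I/O failure. */
int probe_pollout(struct pollout_probe *res)
{
	int fd[2], ret = -1;
	long page = sysconf(_SC_PAGESIZE);
	char *buf = NULL;

	assert(res != NULL);
	if (page <= 0 || pipe(fd))
		return -1;

	buf = calloc(1, (size_t)page);
	if (!buf || fcntl(fd[1], F_SETFL, O_NONBLOCK))
		goto out;

	/* Fill the pipe one page at a time until the kernel refuses. */
	while (write(fd[1], buf, (size_t)page) > 0)
		;
	if (errno != EAGAIN)
		goto out;
	res->after_fill = pollout_ready(fd[1]);

	/* Consume a single byte: the oldest slot still holds page-1 bytes. */
	if (read_exactly(fd[0], buf, 1))
		goto out;
	res->after_one_byte = pollout_ready(fd[1]);

	/* Consume the rest of that page: now one slot is really free. */
	if (read_exactly(fd[0], buf, (size_t)page - 1))
		goto out;
	res->after_one_page = pollout_ready(fd[1]);
	ret = 0;
out:
	free(buf);
	close(fd[0]);
	close(fd[1]);
	return ret;
}
```

Calling probe_pollout() from a small main() and printing the three fields is
enough to see the effect; the interesting one is after_one_byte, since that
is exactly the situation in which the old pipe_read() still did
wake_up(&pipe->wr_wait).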
Now, after some grepping I have found

	static void p9_conn_create(struct p9_client *client)
	{
		...
		init_poll_funcptr(&m->pt, p9_pollwait);
		n = p9_fd_poll(client, &m->pt, NULL);
		...
	}

So, iiuc, in this case p9_fd_poll(&m->pt /* != NULL */) -> p9_pollwait()
paths will add the "dummy" pwait->wait entries with ->func = p9_pollwake
to pipe_inode_info.rd_wait and pipe_inode_info.wr_wait.
Hmm... I don't understand why the 2nd vfs_poll(ts->wr) depends on the
ret from vfs_poll(ts->rd), but I assume this is correct.
This means that every time pipe_read() does wake_up(&pipe->wr_wait)
p9_pollwake() is called. This function kicks p9_poll_workfn() which
calls p9_poll_mux() which calls p9_fd_poll() again with pt == NULL.
In this case the conditional vfs_poll(ts->wr) looks more understandable...
So. Without the commit above, p9_poll_mux()->p9_fd_poll() can be called
much more often and, in particular, can report the "additional" EPOLLIN.
Can this somehow explain the problem?
Oleg.