Message-ID: <CAGudoHGdOf35YM013VjGKQJF81OeMN6XQfkx8oF7PKLe08CjDQ@mail.gmail.com>
Date: Mon, 24 Mar 2025 17:03:03 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: K Prateek Nayak <kprateek.nayak@....com>
Cc: Oleg Nesterov <oleg@...hat.com>,
syzbot <syzbot+62262fdc0e01d99573fc@...kaller.appspotmail.com>, brauner@...nel.org,
dhowells@...hat.com, jack@...e.cz, jlayton@...nel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
netfs@...ts.linux.dev, swapnil.sapkal@....com,
syzkaller-bugs@...glegroups.com, viro@...iv.linux.org.uk
Subject: Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
On Mon, Mar 24, 2025 at 3:52 PM K Prateek Nayak <kprateek.nayak@....com> wrote:
> So far, with tracing, this is where I am:
>
> o Mainline + Oleg's optimization reverted:
>
> ...
> kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55
> kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55
> kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7
> kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7
> repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector
> repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work
> repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write
> repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write
> repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0
> repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS!
> kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector
> kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work
> kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up!
> ... (syzbot reproducer continues)
>
> o Mainline:
>
> kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7
> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7
> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55
> kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55
> repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector
> repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work
> repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write
> repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write
> repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0
> repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS!
> ... (syzbot reproducer hangs)
>
> With Oleg's optimization, there is a third "kworker/u1030" component that
> never gets woken up, for reasons currently unknown to me. I'll keep
> digging.
>
Thanks for the update.
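For reference, the wait the reproducer is stuck in is the usual bit-wait
pairing on NETFS_RREQ_IN_PROGRESS. Roughly (paraphrasing from memory, so
treat this as a sketch rather than the exact netfs code; "wreq" is just a
stand-in name for the write request):

	/* waiter side, cf. netfs_unbuffered_write_iter_locked():
	 * sleep until the collector clears the in-progress bit */
	wait_on_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS,
		    TASK_UNINTERRUPTIBLE);

	/* collector side, cf. netfs_write_collection_worker(), once all
	 * subrequests have completed: clear the bit and wake waiters */
	clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &wreq->flags);

So if the "Write collect clearing and waking up!" line never shows up in
the trace, the waiter has nothing left to wake it.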
It is unclear to me whether you checked, so I'm going to have to ask just
in case: when there is a hang, is there *anyone* stuck in pipe code
(and if so, where)?
You can get the kernel to print stacks for all threads with sysrq:
echo t > /proc/sysrq-trigger
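The dump lands in the kernel log, so something along these lines should be
enough to tell whether any task is sitting in fs/pipe.c (illustrative
commands, adjust to taste):

	dmesg > stacks.txt
	grep -n -e pipe_read -e pipe_write stacks.txt

and then look at the full stack of whatever shows up.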
--
Mateusz Guzik <mjguzik gmail.com>