[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8fbd6f22-2086-4cf7-923f-ac95688c8ddf@amd.com>
Date: Mon, 24 Mar 2025 21:55:45 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Mateusz Guzik <mjguzik@...il.com>
CC: Oleg Nesterov <oleg@...hat.com>, syzbot
<syzbot+62262fdc0e01d99573fc@...kaller.appspotmail.com>,
<brauner@...nel.org>, <dhowells@...hat.com>, <jack@...e.cz>,
<jlayton@...nel.org>, <linux-fsdevel@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <netfs@...ts.linux.dev>,
<swapnil.sapkal@....com>, <syzkaller-bugs@...glegroups.com>,
<viro@...iv.linux.org.uk>
Subject: Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
Hello Mateusz,
On 3/24/2025 9:33 PM, Mateusz Guzik wrote:
> On Mon, Mar 24, 2025 at 3:52 PM K Prateek Nayak <kprateek.nayak@....com> wrote:
>> So far, with tracing, this is where I'm:
>>
>> o Mainline + Oleg's optimization reverted:
>>
>> ...
>> kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55
>> kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55
>> kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7
>> kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7
>> repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector
>> repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work
>> repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write
>> repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write
>> repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0
>> repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS!
>> kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector
>> kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work
>> kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up!
>> ... (syzbot reproducer continues)
>>
>> o Mainline:
>>
>> kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7
>> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7
>> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55
>> kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55
>> repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector
>> repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work
>> repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write
>> repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write
>> repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0
>> repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS!
>> ... (syzbot reproducer hangs)
>>
>> There is a third "kworker/u1030" component that never gets woken up for
>> reasons currently unknown to me with Oleg's optimization. I'll keep
>> digging.
>>
>
> Thanks for the update.
>
> It is unclear to me if you checked, so I'm going to have to ask just
> in case: when there is a hang, is there *anyone* stuck in pipe code
> (and if so, where)?
>
> You can get the kernel to print stacks for all threads with sysrq:
> echo t > /proc/sysrq-trigger
This dumps an insane amount of stuff on my console. Let me search if there
is any reference to pipe somewhere in there. Meanwhile, for the reproducer
threads themself, they are at:
# ps aux | grep repro
root 4245 0.0 0.0 19040 0 ? D 16:10 0:00 ./repro
root 4306 0.0 0.0 19172 0 pts/5 S+ 16:13 0:00 ./repro
root 4307 0.0 0.0 19040 0 pts/5 D 16:13 0:00 ./repro
# cat /proc/4306/stack
[<0>] do_wait+0xa9/0x100
[<0>] kernel_wait4+0xa9/0x150
[<0>] do_syscall_64+0x6f/0x110
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
# cat /proc/4307/stack
[<0>] do_lock_mount+0x37/0x200
[<0>] path_mount+0x876/0xae0
[<0>] __x64_sys_mount+0x103/0x130
[<0>] do_syscall_64+0x6f/0x110
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
# cat /proc/4245/stack
[<0>] netfs_unbuffered_write_iter_locked+0x222/0x2b0 [netfs]
[<0>] netfs_unbuffered_write_iter+0x12d/0x1f0 [netfs]
[<0>] vfs_write+0x307/0x420
[<0>] ksys_write+0x66/0xe0
[<0>] do_syscall_64+0x6f/0x110
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
For kworkers, I did:
# for i in $(ps aux | grep kworker | tr -s ' ' | cut -d ' ' -f2); do cat /proc/$i/stack | grep pipe; done
and I got no output. Same for the string "wait".
>
--
Thanks and Regards,
Prateek
Powered by blists - more mailing lists