Message-ID: <550503.1690588340@warthog.procyon.org.uk>
Date:   Sat, 29 Jul 2023 00:52:20 +0100
From:   David Howells <dhowells@...hat.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     dhowells@...hat.com,
        syzbot <syzbot+f527b971b4bdc8e79f9e@...kaller.appspotmail.com>,
        bpf@...r.kernel.org, brauner@...nel.org, davem@...emloft.net,
        dsahern@...nel.org, edumazet@...gle.com,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        netdev@...r.kernel.org, pabeni@...hat.com,
        syzkaller-bugs@...glegroups.com, viro@...iv.linux.org.uk
Subject: Re: [syzbot] [fs?] INFO: task hung in pipe_release (4)

Jakub Kicinski <kuba@...nel.org> wrote:

> Hi David, any ideas about this one? Looks like it triggers on fairly
> recent upstream?

I've managed to reproduce it finally.  Instrumenting the pipe_lock/unlock
functions, splice_to_socket() and pipe_release() seems to show that
pipe_release() is being called whilst splice_to_socket() is still running.

I *think* syzbot is arranging things such that splice_to_socket() takes a
significant amount of time so that another thread can close the socket as it
exits.

In this sample logging, the pipe is created by pid 7101:

[   66.205719] --pipe 7101
[   66.209942] lock
[   66.212526] locked
[   66.215344] unlock
[   66.218103] unlocked

splice begins in 7101 also and locks the pipe:

[   66.221057] ==>splice_to_socket() 7101
[   66.225596] lock
[   66.228177] locked

but for some reason, pid 7100 then tries to release it:

[   66.377781] release 7100

and hangs on the __pipe_lock() call in pipe_release():

[   66.381059] lock
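
To make the sequence in the log easier to follow, here is a minimal userspace model of it in Python. It is only a sketch under assumptions: an ordinary threading.Lock stands in for the pipe mutex, the names simulate(), splicer() and releaser() are invented for illustration, and the releaser uses a timeout purely so that the script terminates (the real __pipe_lock() call has no timeout and simply waits).

```python
import threading
import time


def simulate(hold_seconds=0.5, wait_seconds=0.1):
    """Model the logged ordering; returns the list of events in order."""
    pipe_mutex = threading.Lock()      # stand-in for the pipe's mutex
    splice_has_lock = threading.Event()
    events = []

    def splicer():
        # Plays the role of splice_to_socket(): lock the pipe and keep it
        # locked for a long time while the "send" is in progress.
        pipe_mutex.acquire()
        events.append("splice: locked")
        splice_has_lock.set()
        time.sleep(hold_seconds)
        pipe_mutex.release()
        events.append("splice: unlocked")

    def releaser():
        # Plays the role of pipe_release() running in another thread.
        splice_has_lock.wait()
        events.append("release: lock")
        got = pipe_mutex.acquire(timeout=wait_seconds)
        events.append("release: locked" if got else "release: still waiting")
        if got:
            pipe_mutex.release()

    threads = [threading.Thread(target=splicer),
               threading.Thread(target=releaser)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    for line in simulate():
        print(line)
```

If the splicer never drops the lock, the releaser stays at the "release: lock" step indefinitely, which matches the last line of the kernel log above.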

The syz reproducer does weird things with threading - and I'm wondering if
there's a file struct refcount bug here.  Note that splice_to_socket() can't
access the pipe file structs to alter the refcount, and the involved pipe
isn't communicated to udp_sendmsg() in any way - so if there is a refcount
bug, it must be somewhere in the VFS, the pipe driver or the splice
infrastructure:-/.
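
As a rough illustration of why a refcount imbalance would produce this ordering, the sketch below models the general rule that a file's release hook runs only when the last reference is dropped. Everything in it is hypothetical (the File class, run() and the take_ref_for_splice flag are invented names), and it says nothing about where such a bug might actually be.

```python
class File:
    """Hypothetical stand-in for a refcounted file object."""

    def __init__(self, on_release):
        self.refs = 1              # reference held by the open descriptor
        self.on_release = on_release

    def get(self):
        self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            # Only the final put triggers the release hook.
            self.on_release()


def run(take_ref_for_splice):
    """Return the order of events for a balanced or unbalanced refcount."""
    log = []
    f = File(lambda: log.append("release"))
    if take_ref_for_splice:
        f.get()                    # the in-flight operation pins the file
    log.append("splice start")
    f.put()                        # another thread closes the descriptor
    log.append("splice end")
    if take_ref_for_splice:
        f.put()                    # operation finished, drop its reference
    return log


if __name__ == "__main__":
    print(run(True))    # release comes after the splice has ended
    print(run(False))   # release fires while the splice is still running
```

With the reference held, release is deferred until the splice has finished; without it, release lands in the middle, which is the pattern the log shows.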

I'm also not sure what's going on inside udp_sendmsg() as yet.  It doesn't
show a stack in /proc/7101/stack, which means it doesn't hit a schedule().

David
