linux-kernel - Re: [syzbot] [xfs?] INFO: task hung in __fdget


Message-ID: <20230903231338.GN3390869@ZenIV>
Date:   Mon, 4 Sep 2023 00:13:38 +0100
From:   Al Viro <viro@...iv.linux.org.uk>
To:     Dave Chinner <david@...morbit.com>
Cc:     Mateusz Guzik <mjguzik@...il.com>,
        syzbot <syzbot+e245f0516ee625aaa412@...kaller.appspotmail.com>,
        brauner@...nel.org, djwong@...nel.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-xfs@...r.kernel.org, llvm@...ts.linux.dev, nathan@...nel.org,
        ndesaulniers@...gle.com, syzkaller-bugs@...glegroups.com,
        trix@...hat.com
Subject: Re: [syzbot] [xfs?] INFO: task hung in __fdget_pos (4)

On Mon, Sep 04, 2023 at 08:27:15AM +1000, Dave Chinner wrote:

> It already is (sysrq-t), but I'm not sure that will help - if it is
> a leaked unlock then nothing will show up at all.

Unlikely; grep and you'll see - very few callers, and for all of them
there's an fdput_pos() downstream of any fdget_pos() that had picked
a non-NULL file reference.

In theory, it's not impossible that something had stripped FDPUT_POS_UNLOCK
from the flags, but that's basically a "something might've corrupted the
local variables" scenario.  There are 12 functions total where we might
be calling fdget_pos() and all of them are pretty small (1 in alpha
osf_sys.c, 6 in read_write.c and 5 in readdir.c); none of them takes
the address of struct fd, none of them has assignments to it after fdget_pos()
and the only accesses to its members are those to fd.file - all fetches.
Control flow is also easy to check - they are all short.
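
For illustration only, the caller pattern described above can be modelled
in userspace roughly as follows.  This is a sketch, not kernel code: the
model_*() helpers and the pos_locked field are hypothetical stand-ins, the
model takes the "lock" for every non-NULL file (the real fdget_pos() only
does so when the file needs it), and model_read_flag_stripped() exists
solely to show what a stripped FDPUT_POS_UNLOCK would do.

```c
#include <errno.h>
#include <stddef.h>

#define FDPUT_POS_UNLOCK 2	/* flag name from the mail; value is part of the model */

/* Hypothetical model: pos_locked stands in for ->f_pos_lock being held. */
struct file { int pos_locked; long pos; };
struct fd { struct file *file; unsigned int flags; };

/* Model of fdget_pos(): take the position lock, remember that we did. */
static struct fd model_fdget_pos(struct file *file)
{
	struct fd f = { file, 0 };

	if (file) {
		file->pos_locked = 1;
		f.flags |= FDPUT_POS_UNLOCK;
	}
	return f;
}

/* Model of fdput_pos(): unlock only if the flag says we locked. */
static void model_fdput_pos(struct fd f)
{
	if (f.flags & FDPUT_POS_UNLOCK)
		f.file->pos_locked = 0;
}

/* Shape of a typical caller: only fetches of f.file, put on every path
 * that got a non-NULL file. */
long model_read(struct file *file, long count)
{
	struct fd f = model_fdget_pos(file);
	long ret = -EBADF;

	if (f.file) {
		f.file->pos += count;
		ret = count;
		model_fdput_pos(f);
	}
	return ret;
}

/* The "corrupted local variables" case: flag lost, lock never dropped. */
long model_read_flag_stripped(struct file *file, long count)
{
	struct fd f = model_fdget_pos(file);
	long ret = -EBADF;

	if (f.file) {
		f.file->pos += count;
		ret = count;
		f.flags &= ~FDPUT_POS_UNLOCK;	/* none of the real callers does this */
		model_fdput_pos(f);
	}
	return ret;
}
```

In the first function the lock cannot leak as long as nothing writes to f
between the get and the put, which is the property checked by reading the
12 real callers.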

IMO it's much more likely that we'll find something like

thread A:
	grabs some fs lock
	gets stuck on something
thread B: write()
	finds file
	grabs ->f_pos_lock
	calls into filesystem
	blocks on fs lock held by A
thread C: read()/write()/lseek() on the same file
	blocks on ->f_pos_lock
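
A userspace sketch of the scenario above, using pthread mutexes as
hypothetical stand-ins for the fs lock and ->f_pos_lock (none of the names
below come from the kernel).  Thread C is played by the caller and uses
trylock rather than lock, so the demonstration reports the contention
instead of hanging; the stage counter only forces the A, B, C ordering.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for "some fs lock" and one file's ->f_pos_lock. */
static pthread_mutex_t fs_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t f_pos_lock = PTHREAD_MUTEX_INITIALIZER;

/* 1: A holds fs_lock, 2: B holds f_pos_lock, 3: A is allowed to proceed */
static atomic_int stage;

static void wait_for(int s)
{
	while (atomic_load(&stage) < s)
		sched_yield();
}

static void *thread_a(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&fs_lock);		/* grabs some fs lock */
	atomic_store(&stage, 1);
	wait_for(3);				/* gets stuck on something */
	pthread_mutex_unlock(&fs_lock);
	return NULL;
}

static void *thread_b(void *unused)
{
	(void)unused;
	wait_for(1);
	pthread_mutex_lock(&f_pos_lock);	/* write(): grabs ->f_pos_lock */
	atomic_store(&stage, 2);
	pthread_mutex_lock(&fs_lock);		/* blocks on fs lock held by A */
	pthread_mutex_unlock(&fs_lock);
	pthread_mutex_unlock(&f_pos_lock);
	return NULL;
}

struct result { int c_blocked; int free_after; };

struct result run_scenario(void)
{
	struct result r;
	pthread_t a, b;

	atomic_store(&stage, 0);
	pthread_create(&a, NULL, thread_a, NULL);
	pthread_create(&b, NULL, thread_b, NULL);

	/* Thread C: B already owns f_pos_lock and cannot drop it until A
	 * releases fs_lock, so a real lock here would sleep. */
	wait_for(2);
	r.c_blocked = pthread_mutex_trylock(&f_pos_lock) == EBUSY;

	atomic_store(&stage, 3);		/* unstick A */
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	/* Once A lets go, B finishes and the position lock is free again. */
	r.free_after = pthread_mutex_trylock(&f_pos_lock) == 0;
	if (r.free_after)
		pthread_mutex_unlock(&f_pos_lock);
	return r;
}
```

Note that nothing was leaked here: C is stuck only for as long as A is,
which is why a hung-task report pointing at the f_pos_lock waiter says
little about where the actual problem sits.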