Message-ID: <20230903231338.GN3390869@ZenIV>
Date: Mon, 4 Sep 2023 00:13:38 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: Dave Chinner <david@...morbit.com>
Cc: Mateusz Guzik <mjguzik@...il.com>,
syzbot <syzbot+e245f0516ee625aaa412@...kaller.appspotmail.com>,
brauner@...nel.org, djwong@...nel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-xfs@...r.kernel.org, llvm@...ts.linux.dev, nathan@...nel.org,
ndesaulniers@...gle.com, syzkaller-bugs@...glegroups.com,
trix@...hat.com
Subject: Re: [syzbot] [xfs?] INFO: task hung in __fdget_pos (4)
On Mon, Sep 04, 2023 at 08:27:15AM +1000, Dave Chinner wrote:
> It already is (sysrq-t), but I'm not sure that will help - if it is
> a leaked unlock then nothing will show up at all.
Unlikely; grep and you'll see - very few callers, and for all of them
there's an fdput_pos() downstream of any fdget_pos() that had picked up
a non-NULL file reference.
In theory, it's not impossible that something had stripped FDPUT_POS_UNLOCK
from the flags, but that's basically the "something might've corrupted the
local variables" scenario. There are 12 functions total where we might
be calling fdget_pos() and all of them are pretty small (1 in alpha
osf_sys.c, 6 in read_write.c and 5 in readdir.c); none of those takes
an address of struct fd, none of them has assignments to it after fdget_pos()
and the only accesses to its members are those to fd.file - all fetches.
Control flow is also easy to check - they are all short.
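
For illustration only, here is a userspace mock of the caller shape described above. Everything prefixed with mock_ or MOCK_ is a hypothetical stand-in, not the kernel code: the mock always takes the lock, whereas the real fdget_pos() only does so when the file needs it, and a pthread mutex plays the part of ->f_pos_lock. The corrupt_flags knob exists only to show what a stripped unlock flag would do; no real caller has anything like it.

```c
/* Userspace mock, NOT kernel code: names with mock_/MOCK_ are made up. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define MOCK_FDPUT_POS_UNLOCK 2u	/* stand-in for the unlock flag */

struct mock_file { pthread_mutex_t pos_lock; };	/* plays ->f_pos_lock */
struct mock_fd { struct mock_file *file; unsigned int flags; };

static struct mock_file the_file = { PTHREAD_MUTEX_INITIALIZER };

/* Look up the file, take the position lock, remember to drop it later. */
static struct mock_fd mock_fdget_pos(int fd)
{
	struct mock_fd f = { NULL, 0 };

	if (fd != 0)		/* only descriptor 0 "exists" in this mock */
		return f;
	f.file = &the_file;
	pthread_mutex_lock(&f.file->pos_lock);
	f.flags |= MOCK_FDPUT_POS_UNLOCK;
	return f;
}

/* Drop the position lock only if the flag says we took it. */
static void mock_fdput_pos(struct mock_fd f)
{
	if (f.flags & MOCK_FDPUT_POS_UNLOCK) {
		assert(f.file != NULL);
		pthread_mutex_unlock(&f.file->pos_lock);
	}
}

/* Caller shape: get, bail out on NULL, use f.file, put. */
int mock_caller(int fd, int corrupt_flags)
{
	struct mock_fd f = mock_fdget_pos(fd);

	if (!f.file)
		return -EBADF;
	if (corrupt_flags)	/* the "corrupted local variable" case */
		f.flags &= ~MOCK_FDPUT_POS_UNLOCK;
	mock_fdput_pos(f);
	return 0;
}

/* Returns 1 if nobody holds the position lock, 0 otherwise. */
int pos_lock_is_free(void)
{
	int rc = pthread_mutex_trylock(&the_file.pos_lock);

	if (rc == 0)
		pthread_mutex_unlock(&the_file.pos_lock);
	return rc == 0;
}
```

With corrupt_flags == 0 the lock is free again once mock_caller() returns; with corrupt_flags != 0 it stays held, which is the leak that would need the flags to be clobbered between the get and the put.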
IMO it's much more likely that we'll find something like

thread A:
	grabs some fs lock
	gets stuck on something
thread B: write()
	finds file
	grabs ->f_pos_lock
	calls into filesystem
	blocks on fs lock held by A
thread C: read()/write()/lseek() on the same file
	blocks on ->f_pos_lock
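
A rough userspace sketch of that chain, in case it helps to see it run. This is not kernel code: pthread mutexes stand in for the fs lock and ->f_pos_lock, semaphores force the ordering, and all names (simulate_chain, fs_lock, pos_lock and so on) are made up for the example. Thread C uses a timed lock attempt so the demo terminates instead of hanging the way the real thing would.

```c
/* Userspace sketch only: pthread mutexes stand in for the kernel locks. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

static pthread_mutex_t fs_lock = PTHREAD_MUTEX_INITIALIZER;	/* "some fs lock" */
static pthread_mutex_t pos_lock = PTHREAD_MUTEX_INITIALIZER;	/* plays ->f_pos_lock */
static sem_t a_has_fs_lock, b_has_pos_lock, a_unstuck;

static void *thread_a(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&fs_lock);		/* grabs some fs lock */
	sem_post(&a_has_fs_lock);
	sem_wait(&a_unstuck);			/* gets stuck on something */
	pthread_mutex_unlock(&fs_lock);
	return NULL;
}

static void *thread_b(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&pos_lock);		/* grabs the position lock */
	sem_post(&b_has_pos_lock);
	pthread_mutex_lock(&fs_lock);		/* blocks on fs lock held by A */
	pthread_mutex_unlock(&fs_lock);
	pthread_mutex_unlock(&pos_lock);
	return NULL;
}

static void *thread_c(void *arg)
{
	struct timespec deadline;
	int rc;

	/* Give up after 200ms rather than hang forever. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_nsec += 200L * 1000 * 1000;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}
	rc = pthread_mutex_timedlock(&pos_lock, &deadline);
	if (rc == 0)
		pthread_mutex_unlock(&pos_lock);
	*(int *)arg = rc;			/* report what happened */
	return NULL;
}

/* Runs the A/B/C chain once; returns the result of C's lock attempt. */
int simulate_chain(void)
{
	pthread_t a, b, c;
	int c_result = -1, rc;

	sem_init(&a_has_fs_lock, 0, 0);
	sem_init(&b_has_pos_lock, 0, 0);
	sem_init(&a_unstuck, 0, 0);

	rc = pthread_create(&a, NULL, thread_a, NULL);
	assert(rc == 0);
	sem_wait(&a_has_fs_lock);		/* A now holds fs_lock */
	rc = pthread_create(&b, NULL, thread_b, NULL);
	assert(rc == 0);
	sem_wait(&b_has_pos_lock);		/* B now holds pos_lock */
	rc = pthread_create(&c, NULL, thread_c, &c_result);
	assert(rc == 0);
	pthread_join(c, NULL);			/* C has given up by now */

	sem_post(&a_unstuck);			/* unstick A so everything unwinds */
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	sem_destroy(&a_has_fs_lock);
	sem_destroy(&b_has_pos_lock);
	sem_destroy(&a_unstuck);
	return c_result;
}
```

Built with -pthread, simulate_chain() returns ETIMEDOUT: C never gets the position lock while A is stuck, even though C never touches the fs lock itself. That is why a hung-task report for C can show only the wait on ->f_pos_lock, with the actual culprit two hops away.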