[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250304203657.GA4063187@perftesting>
Date: Tue, 4 Mar 2025 15:36:57 -0500
From: Josef Bacik <josef@...icpanda.com>
To: Amir Goldstein <amir73il@...il.com>
Cc: Jan Kara <jack@...e.cz>,
syzbot <syzbot+7229071b47908b19d5b7@...kaller.appspotmail.com>,
akpm@...ux-foundation.org, axboe@...nel.dk, brauner@...nel.org,
cem@...nel.org, chandan.babu@...cle.com, djwong@...nel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-xfs@...r.kernel.org,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
On Tue, Mar 04, 2025 at 09:27:20PM +0100, Amir Goldstein wrote:
> On Tue, Mar 4, 2025 at 5:15 PM Josef Bacik <josef@...icpanda.com> wrote:
> >
> > On Tue, Mar 04, 2025 at 04:09:16PM +0100, Amir Goldstein wrote:
> > > On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@...e.cz> wrote:
> > > >
> > > > Josef, Amir,
> > > >
> > > > this is indeed an interesting case:
> > > >
> > > > On Sun 02-03-25 08:32:30, syzbot wrote:
> > > > > syzbot has found a reproducer for the following issue on:
> > > > ...
> > > > > ------------[ cut here ]------------
> > > > > WARNING: CPU: 1 PID: 6440 at ./include/linux/fsnotify.h:145 fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > Modules linked in:
> > > > > CPU: 1 UID: 0 PID: 6440 Comm: syz-executor370 Not tainted 6.14.0-rc4-syzkaller-ge056da87c780 #0
> > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
> > > > > pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > > pc : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > lr : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > sp : ffff8000a42569d0
> > > > > x29: ffff8000a42569d0 x28: ffff0000dcec1b48 x27: ffff0000d68a1708
> > > > > x26: ffff0000d68a16c0 x25: dfff800000000000 x24: 0000000000008000
> > > > > x23: 0000000000000001 x22: ffff8000a4256b00 x21: 0000000000001000
> > > > > x20: 0000000000000010 x19: ffff0000d68a16c0 x18: ffff8000a42566e0
> > > > > x17: 000000000000e388 x16: ffff800080466c24 x15: 0000000000000001
> > > > > x14: 1fffe0001b31513c x13: 0000000000000000 x12: 0000000000000000
> > > > > x11: 0000000000000001 x10: 0000000000ff0100 x9 : 0000000000000000
> > > > > x8 : ffff0000c6d98000 x7 : 0000000000000000 x6 : 0000000000000000
> > > > > x5 : 0000000000000020 x4 : 0000000000000000 x3 : 0000000000001000
> > > > > x2 : ffff8000a4256b00 x1 : 0000000000000001 x0 : 0000000000000000
> > > > > Call trace:
> > > > > fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145 (P)
> > > > > filemap_fault+0x12b0/0x1518 mm/filemap.c:3509
> > > > > xfs_filemap_fault+0xc4/0x194 fs/xfs/xfs_file.c:1543
> > > > > __do_fault+0xf8/0x498 mm/memory.c:4988
> > > > > do_read_fault mm/memory.c:5403 [inline]
> > > > > do_fault mm/memory.c:5537 [inline]
> > > > > do_pte_missing mm/memory.c:4058 [inline]
> > > > > handle_pte_fault+0x3504/0x57b0 mm/memory.c:5900
> > > > > __handle_mm_fault mm/memory.c:6043 [inline]
> > > > > handle_mm_fault+0xfa8/0x188c mm/memory.c:6212
> > > > > do_page_fault+0x570/0x10a8 arch/arm64/mm/fault.c:690
> > > > > do_translation_fault+0xc4/0x114 arch/arm64/mm/fault.c:783
> > > > > do_mem_abort+0x74/0x200 arch/arm64/mm/fault.c:919
> > > > > el1_abort+0x3c/0x5c arch/arm64/kernel/entry-common.c:432
> > > > > el1h_64_sync_handler+0x60/0xcc arch/arm64/kernel/entry-common.c:510
> > > > > el1h_64_sync+0x6c/0x70 arch/arm64/kernel/entry.S:595
> > > > > __uaccess_mask_ptr arch/arm64/include/asm/uaccess.h:169 [inline] (P)
> > > > > fault_in_readable+0x168/0x310 mm/gup.c:2234 (P)
> > > > > fault_in_iov_iter_readable+0x1dc/0x22c lib/iov_iter.c:94
> > > > > iomap_write_iter fs/iomap/buffered-io.c:950 [inline]
> > > > > iomap_file_buffered_write+0x490/0xd54 fs/iomap/buffered-io.c:1039
> > > > > xfs_file_buffered_write+0x2dc/0xac8 fs/xfs/xfs_file.c:792
> > > > > xfs_file_write_iter+0x2c4/0x6ac fs/xfs/xfs_file.c:881
> > > > > new_sync_write fs/read_write.c:586 [inline]
> > > > > vfs_write+0x704/0xa9c fs/read_write.c:679
> > > >
> > > > The backtrace actually explains it all. We had a buffered write whose
> > > > buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> > > > of the buffer happens already (quite deep) under the filesystem freeze
> > > > protection (obtained in vfs_write()) which breaks assumptions of HSM code
> > > > and introduces potential deadlock of HSM handler in userspace with filesystem
> > > > freezing. So we need to think how to deal with this case...
> > >
> > > Ouch. It's like the splice mess all over again.
> > > Except we do not really care to make this use case work with HSM
> > > in the sense that we do not care to have to fill in the mmaped file content
> > > in this corner case - we just need to let HSM fail the access if content is
> > > not available.
> > >
> > > If you remember, in one of my very early version of pre-content events,
> > > the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
> > > carried a flag (I think it was called FAN_PRE_VFS) to communicate to
> > > HSM service if it was safe to write to fs in the context of event handling.
> > >
> > > At the moment, I cannot think of any elegant way out of this use case
> > > except annotating the event from fault_in_readable() as "unsafe-for-write".
> > > This will relax the debugging code assertion and notify the HSM service
> > > (via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
> > > Maybe we can reuse the FAN_ACCESS_PERM event to communicate
> > > this case to HSM service.
> > >
> > > WDYT?
> >
> > I think that mmap was a mistake.
>
> What do you mean?
> Isn't the fault hook required for your large executables use case?
I mean the mmap syscall was a mistake ;).
>
> >
> > Is there a way to tell if we're currently in a path that is under fsfreeze
> > protection?
>
> Not at the moment.
> At the moment, file_write_not_started() is not a reliable check
> (has false positives) without CONFIG_LOCKDEP.
>
> > Just denying this case would be a simpler short term solution while
> > we come up with a long term solution. I think your solution is fine, but I'd be
> > just as happy with a simpler "this isn't allowed" solution. Thanks,
>
> Yeh, I don't mind that, but it's a bit of an overkill considering that
> file with no content may in fact be rare.
Agreed, I'm fine with your solution. Thanks,
Josef
Powered by blists - more mailing lists