[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241112142812.37e3hyar3zqoqo5a@quack3>
Date: Tue, 12 Nov 2024 15:28:12 +0100
From: Jan Kara <jack@...e.cz>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Amir Goldstein <amir73il@...il.com>, Josef Bacik <josef@...icpanda.com>,
kernel-team@...com, linux-fsdevel@...r.kernel.org, jack@...e.cz,
brauner@...nel.org, linux-xfs@...r.kernel.org,
linux-btrfs@...r.kernel.org, linux-mm@...ck.org,
linux-ext4@...r.kernel.org
Subject: Re: [PATCH v6 06/17] fsnotify: generate pre-content permission event
on open
On Mon 11-11-24 16:37:30, Linus Torvalds wrote:
> On Mon, 11 Nov 2024 at 16:00, Amir Goldstein <amir73il@...il.com> wrote:
> >
> > I think that's a good idea for pre-content events, because it's fine
> > to say that if the sb/mount was not watched by a pre-content event listener
> > at the time of file open, then we do not care.
>
> Right.
>
> > The problem is that legacy inotify/fanotify watches can be added after
> > file is open, so that is allegedly why this optimization was not done for
> > fsnotify hooks in the past.
>
> So honestly, even if the legacy fsnotify hooks can't look at the file
> flag, they could damn well look at an inode flag.
>
> And I'm not even convinced that we couldn't fix them to just look at a
> file flag, and say "tough luck, somebody opened that file before you
> started watching, you don't get to see what they did".
>
> So even if we don't look at a file->f_mode flag, the lergacy cases
> should look at i_fsnotify_mask, and do that *first*.
>
> IOW, not do it like fsnotify_object_watched() does now, which is just
> broken. Again, it looks at inode->i_sb->s_fsnotify_mask completely
> pointlessly, but it also does it much too late - it gets called after
> we've already called into the fsnotify() code and have messed up the
> I$ etc.
>
> The "linode->i_sb->s_fsnotify_mask" is not only an extra indirection,
> it should be very *literally* pointless. If some bit isn't set in
> i_sb->s_fsnotify_mask, then there should be no way to set that bit in
> inode->i_fsnotify_mask. So the only time we should access
> i_sb->s_fsnotify_mask is when i_notify_mask gets *modified*, not when
> it gets tested.
Thanks for explanation but what you write here and below seems to me as if
you are not aware of some features fanotify API provides (for many years).
With fanotify you can place a mark not only on a file / directory but you
can place it also on a mount point or a superblock. In that case any
operation of appropriate type happening through that mount point or on that
superblock should deliver the event to the notification group that placed
the mark. So for example we check inode->i_sb->s_fsnotify_mask on open to
be able to decide whether somebody asked to get notifications about *all*
opens on that superblock and generate appropriate events. These features
are there since day 1 of fanotify (at least for mountpoints, superblocks
were added later) and quite some userspace depends on them for doing
filesystem-wide event monitoring.
We could somehow cache the fact that someone placed a mark on the
superblock in the inode / struct file but I don't see how to keep such
cache consistent with marks that are attached to the superblock without
incurring the very same cache misses we are trying to avoid.
I do like your idea of caching whether somebody wants the notification in
struct file as that way we can avoid the cache misses for the
new pre-content hooks and possibly in hooks for the traditional fanotify
permission events. But I don't see how we could possibly avoid these cache
misses for the classical notification events like FAN_OPEN, FAN_ACCESS,
FAN_MODIFY, ...
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists