Message-ID: <20110820032106.GB14899@jl-vm1.vm.bytemark.co.uk>
Date: Sat, 20 Aug 2011 04:21:06 +0100
From: Jamie Lokier <jamie@...reable.org>
To: Sylvain Rochet <gradator@...dator.net>
Cc: linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-nfs@...r.kernel.org
Subject: Re: PROBLEM: 2.6.35.7 to 3.0 Inotify events missing
Sylvain Rochet wrote:
> Hi Jamie,
>
>
> On Sat, Aug 20, 2011 at 12:37:56AM +0100, Jamie Lokier wrote:
> >
> > Oh dear, that's a security hole, if something is using inotify/dnotify
> > to watch and assumes that file contents (on the same machine,
> > i.e. server in this case) do not change if there's no event received.
> >
> > It also breaks cache applications which make the same assumption.
> >
> > I do quite like the idea of using it to break past fanotify security
> > restrictions though ;-)
>
> It also probably means that fanotify misses some events when a filesystem
> is modified over NFS. If fanotify is used the way it is designed, i.e.
> with antivirus software, this may be an interesting way to skip the
> antivirus check.
>
> Here we go:
>
> NFS server, run the fanotify example tool:
>
> ~/fanotify-example# ./fanotify -m /data/
>
> NFS client, open a fd then do some I/O:
>
> # exec 1> test
> # ls -la
> #
>
> NFS server log:
>
> /data/test: pid=1235 modify close(writable)
>
> NFS server, cache clearing:
>
> # echo 3 > /proc/sys/vm/drop_caches
>
> NFS client, more I/O:
>
> # ls -la
>
> NFS server log:
>
> /data: pid=1234 modify close(writable)
>
> We receive an event... which is obviously wrong. This is even worse than
> no event at all: we receive an event about the wrong inode, namely the
> parent inode of the modified file.

That sounds like a proper bug. Maybe it can at least be fixed?

> > Is a solution to open inotify watches on every file individually? If
> > so that seems quite severe.
>
> This is what I am going to do, at least temporarily. I only need to
> watch about a million files (and slowly counting).
>
> The startup time to watch an entire filesystem using inotify already
> requires a full filesystem walk. Watching all files and directories
> instead of directories only will not change much, because most of the
> time is spent waiting on I/O operations. This may however require a lot
> more memory on both the kernel side and the userland side.

Watching an entire filesystem entails reading all the directories, but
you don't have to fetch the inodes of files. Still, it's very slow
(about 15 minutes on my /home from a cold cache, just to read the
million or so directories).

There was some work on propagating events upwards so that efficient
recursive watches could be established. That was in the context of
fanotify, but it would make sense for it to be available to all
fsnotify users. I wonder how that went.

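As a rough illustration of the directory-only walk described above, here is a minimal Python sketch. It assumes a Linux filesystem that reports entry types in its directory entries, in which case os.scandir() can usually tell directories from files without a stat call on each file. The function name directories_to_watch is made up for this example, and the sketch only collects the paths; it does not add any watches.

```python
import os


def directories_to_watch(root):
    """Return every directory under root (root included), not following symlinks.

    On most Linux filesystems os.scandir() answers is_dir() from the entry
    type stored in the directory itself, so regular files are not stat()ed.
    That is typical behaviour, not a guarantee: if the type is unknown,
    Python falls back to a stat call for that entry.
    """
    found = []
    pending = [root]
    while pending:
        current = pending.pop()
        found.append(current)
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        pending.append(entry.path)
        except OSError:
            # Unreadable or vanished directory: it stays in the list, but
            # nothing below it is visited.
            continue
    return found
```

A caller would then add one watch per returned path. The cold-cache cost is still dominated by reading the directories themselves, which is the slow part mentioned above.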
> > Then this can be solved, in principle (if there's no better way), by
> > watching a "virtual directory" that gets all events for when the
> > access doesn't have a parent directory. There needs to be some way to
> > watch it, and some way to get the appropriate file from the event (as
> > there is no real directory). Or maybe there could be a virtual
> > filesystem (like /proc, /sys etc.) containing a magic directory that
> > receives these inode-only events, such that lookups in that directory
> > yield the affected file. Exactly as if the directory contains a hard
> > link to every file, perhaps a text encoding of the handles passed
> > through sys_open_by_handle_at.
>
> By doing that, we'll only get the inode number, as we cannot fetch the filename.

Yes... That's OK if we are already tracking inode->multiple-paths in
userspace anyway (for hard links). But it's quite demanding if we
hoped to avoid fetching and storing that in userspace for
st_nlink == 1 files.

In that case it is still better to get a notification "something
unknown on this FS has changed" than no notification at all.
Userspace would react by flushing all of its cached knowledge of
things under directory watches that don't have direct watches. At
least that's reliable and correct behaviour, and if it happens often,
userspace heuristics can react by watching priority inodes more directly.

If that's the common case, then these nameless, pathless events could
just trigger a simple event with a catch-all IN_NO_PATH flag set,
referring to the filesystem but giving no more detail than that. inotify
would accept that flag when adding a watch, ignore the inode given but
remember the filesystem, and send all events with no path to the
watch(es) created with that flag on that filesystem. It's a flag
because the event type is still useful.

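To make the proposed semantics concrete, here is a toy userspace model in Python. Nothing in it is a real kernel interface: IN_NO_PATH is only the flag suggested above, its numeric value is arbitrary, and the class names NoPathModel and Cache are invented for the sketch. It shows two things: pathless events being routed only to watches created with the flag on the same filesystem, and a cache dropping everything it does not watch directly when such an event arrives.

```python
IN_MODIFY = 0x00000002    # real inotify event bit
IN_NO_PATH = 0x08000000   # HYPOTHETICAL flag from the proposal; value is arbitrary


class NoPathModel:
    """Toy model of the proposed routing; not an actual inotify API."""

    def __init__(self):
        self._catch_all = {}   # filesystem id -> list of watch descriptors
        self._next_wd = 1

    def add_watch(self, fs_id, mask):
        """Register a watch; only the filesystem is remembered for IN_NO_PATH."""
        wd = self._next_wd
        self._next_wd += 1
        if mask & IN_NO_PATH:
            self._catch_all.setdefault(fs_id, []).append(wd)
        return wd

    def pathless_event(self, fs_id, event_mask):
        """Return (wd, mask) pairs for an event that has no path to report."""
        watchers = self._catch_all.get(fs_id, [])
        return [(wd, event_mask | IN_NO_PATH) for wd in watchers]


class Cache:
    """Cached knowledge about files, some covered only by a directory watch."""

    def __init__(self):
        self._entries = {}   # path -> (value, directly_watched)

    def put(self, path, value, directly_watched):
        self._entries[path] = (value, directly_watched)

    def get(self, path):
        entry = self._entries.get(path)
        return entry[0] if entry else None

    def on_event(self, mask):
        # "Something unknown on this FS has changed": keep only the entries
        # that have their own direct watch, flush the rest.
        if mask & IN_NO_PATH:
            self._entries = {p: e for p, e in self._entries.items() if e[1]}
```

Because the event type is kept in the mask alongside the flag, a consumer could still tell a pathless modify from a pathless attribute change and flush less aggressively for the latter.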
All the best,
-- Jamie