Date:	Wed, 16 Sep 2009 08:52:19 +0100
From:	Jamie Lokier <jamie@...reable.org>
To:	Eric Paris <eparis@...hat.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Evgeniy Polyakov <zbr@...emap.net>,
	David Miller <davem@...emloft.net>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	netdev@...r.kernel.org, viro@...iv.linux.org.uk,
	alan@...ux.intel.com, hch@...radead.org
Subject: Re: fanotify as syscalls

Eric Paris wrote:
> On Tue, 2009-09-15 at 16:49 -0700, Linus Torvalds wrote:
> > And btw, I still want to know what's so wonderful about fanotify that we 
> > would actually want yet-another-filesystem-notification-interface. So I'm 
> > not saying that I'll take a system call interface.
> 
> The real thing that fanotify provides is an open fd with the event
> rather than some arbitrary 'watch descriptor' that userspace must
> somehow magically map back to data on disk.  This means that it could be
> used to provide subtree notification, which inotify is completely
> incapable of doing.

That's a bit of a spurious claim.

- fanotify does not provide subtree notification in its
  present form.  When it is extended to do that, why wouldn't
  inotify be as well?  That's an fsnotify feature, common to both.

- fanotify does not provide notification at all for some events that
  you get with inotify.  It is not a superset, so you can't use
  fanotify to provide a subtree-capable equivalent to inotify.  What
  a mess when you need the combination of both features!

- fanotify requires you to call readlink(/proc/fd/N) for every event to
  get the path.  It's not a particularly efficient way to get it,
  especially when an app wants to know if it's something in its
  region of interest but doesn't care about the actual path.
  When an app knows it needs the mapping back to the path, why make it
  slow to get it?  That "extensible data format" is being
  underutilised...

- fanotify's descriptor may be race-prone as a way to get the subtree
  used for access, because any of the parent directories could have
  moved and even been deleted before the app calls
  readlink(/proc/fd/N).  I don't know if a _reliable_ way to track
  changes in a subtree can be built on it.  Maybe it can but it
  appears this hasn't been analysed.  It depends on
  readlink(/proc/fd/N)'s behaviour when the dentries have been
  changed, among other things.

- Does the descriptor cause umount to fail when a user does "do some
  stuff in baz; umount baz", or does it serialise nicely?  That's one
  of inotify's nice features - it doesn't cause umounts to fail.
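
To make the readlink(/proc/fd/N) points above concrete, here is a
minimal sketch, assuming Linux with /proc mounted.  It does not use
fanotify at all; it only shows what the kernel reports for an ordinary
open descriptor when a parent directory is renamed and when the file is
unlinked.  The names fd_path and demo are hypothetical, chosen for
illustration.

```python
import os
import tempfile


def fd_path(fd):
    """Return whatever path the kernel currently reports for an open descriptor."""
    return os.readlink(f"/proc/self/fd/{fd}")


def demo():
    # Resolve symlinks so the expected paths match what /proc reports.
    base = os.path.realpath(tempfile.mkdtemp())
    old_dir = os.path.join(base, "old")
    new_dir = os.path.join(base, "new")
    os.mkdir(old_dir)
    fd = os.open(os.path.join(old_dir, "file"), os.O_CREAT | os.O_RDWR, 0o600)
    try:
        before = fd_path(fd)
        os.rename(old_dir, new_dir)               # a parent directory moves
        after_move = fd_path(fd)
        os.unlink(os.path.join(new_dir, "file"))  # the file itself goes away
        after_unlink = fd_path(fd)
    finally:
        os.close(fd)
        os.rmdir(new_dir)
        os.rmdir(base)
    return base, before, after_move, after_unlink
```

The answer depends on when you ask: a path read after the rename is no
longer the path that was used for the access, and a path read after the
unlink does not name any file.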

> And it can be used to provide system wide notification.  We all know
> who wants that.

People who want to break out of chroot/namespace jails using the
conveniently provided open file descriptor? :-)

Seriously, what does system-wide fanotify do when run from a
chroot/namespace/cgroup, and a file outside them is accessed?

If the event is delivered with a file descriptor, that's a security hole.
If it's not delivered, that sounds like working subtree support?
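
The worry is that an open descriptor grants access on its own, whether
or not the receiver could reach the file by name.  A minimal sketch of
that property, assuming Linux and an unprivileged process: it does not
set up a chroot or namespace (that needs root), it only shows that a
directory descriptor keeps working after the name used to open it has
stopped resolving.  read_via_dir_fd and demo are hypothetical names.

```python
import os
import tempfile


def read_via_dir_fd(dir_fd, name):
    """Open and read `name` relative to an already-open directory descriptor."""
    fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)


def demo():
    base = tempfile.mkdtemp()
    outside = os.path.join(base, "outside")
    moved = os.path.join(base, "moved")
    os.mkdir(outside)
    with open(os.path.join(outside, "data"), "wb") as f:
        f.write(b"hello")
    dir_fd = os.open(outside, os.O_RDONLY | os.O_DIRECTORY)
    os.rename(outside, moved)  # the original name no longer resolves
    try:
        path_gone = not os.path.exists(outside)
        data = read_via_dir_fd(dir_fd, "data")  # the descriptor still works
    finally:
        os.close(dir_fd)
        os.unlink(os.path.join(moved, "data"))
        os.rmdir(moved)
        os.rmdir(base)
    return path_gone, data
```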

I'd expect anti-malware to want to be run inside VMs quite often...

Note that there's no such thing as "the real system root" any more.

> It provides an extensible data format which allows growth impossible in
> inotify.  I don't know if anyone remembers the inotify patches which
> wanted to overload the inotify cookie field for some other information,
> but inotify information extension is not reasonable or backwards
> compatible.

I agree with this (although that's what flags are for -- see clone).
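
For illustration only, one common way to get that kind of growth is to
put a length field at the front of every record, so that an old reader
can skip fields it does not know about.  The layout below (event_len,
version, mask) is a hypothetical example and is not the format used by
the fanotify patches; parse_events is likewise a made-up name.

```python
import struct

# Hypothetical fixed header: u32 event_len, u32 version, u64 mask.
HEADER = struct.Struct("=IIQ")


def parse_events(buf):
    """Return (version, mask) per record, skipping any unknown trailing fields."""
    events = []
    offset = 0
    while offset + HEADER.size <= len(buf):
        event_len, version, mask = HEADER.unpack_from(buf, offset)
        # Stop on a malformed or truncated record rather than guess.
        if event_len < HEADER.size or offset + event_len > len(buf):
            break
        events.append((version, mask))
        offset += event_len  # step over fields this reader does not know
    return events
```

A reader written against the 16-byte header keeps working when a later
writer appends extra bytes to each record, provided event_len is
honoured.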

I don't have a problem with the next interface being fanotify (despite
arguing a lot); I just want to see the next one being useful for the
things I would otherwise be proposing my own yet-another-interface
for.  So we don't need a fourth one soon after the third due to
easily foreseen limitations.

> I've got private commitments for two very large anti malware companies,
> both of which unprotect and hack syscall tables in their customer's
> kernels, that they would like to move to an fanotify interface.  Both
> Red Hat and Suse have expressed interest in these patches and have
> contributed to the patch set.
> 
> The patch set is actually rather small (entire set of about 20 patches
> is 1800 lines) as it builds on the fsnotify work already in 2.6.31 to
> reuse code from inotify rather than reimplement the same things over and
> over (like we previously had with inotify and dnotify)

I don't have any problem with either of these, and _fs_notify
generally seems like an improvement.  I don't have a problem with
fanotify either.  For what it does, it's ok.

> Don't know what else to say.....

Answer questions about use-cases that you're not interested in?  Why
block them?  What about Evgeniy's request for an event without an open
fd - because he needs the pid information (which inotify doesn't provide)
but not the fd?

Sorry to be so harsh.  I'm really trying to make sure we don't repeat
the mistakes of dnotify and inotify, and end up with a third interface
which also is too restrictive (because it's good enough for your
anti-malware and HSM customers) so that a fourth interface will be
needed soon after.

I'd like to be able to use it from some applications to accelerate
userspace caching of things (faster Make, faster Samba) without
penalising all other applications touching unrelated parts of the
filesystem.  The attitude "you can live with 10% slowdown" worries me.
I'm sure that can be fixed with a bit of care.

If the intention is to maintain fanotify and inotify side-by-side for
different uses (because fanotify returns open descriptors and blocks
the accessing process until acked), that's ok with me.  It makes
sense.  But then it's messy that neither offers a superset of the
other regarding which files and events are tracked.

If it's right that inotify has no room for extensibility (I'm not sure
about this), then it appears we already made a mess with dnotify and
inotify, so it would be a shame to repeat the same mistakes again.
Let's get the next one right, even if it takes a bit longer, ok?

-- Jamie