Message-ID: <20200331122554.GA27469@gardel-login>
Date: Tue, 31 Mar 2020 14:25:54 +0200
From: Lennart Poettering <mzxreary@...inter.de>
To: Miklos Szeredi <miklos@...redi.hu>
Cc: Karel Zak <kzak@...hat.com>,
Christian Brauner <christian.brauner@...ntu.com>,
David Howells <dhowells@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>, dray@...hat.com,
Miklos Szeredi <mszeredi@...hat.com>,
Steven Whitehouse <swhiteho@...hat.com>,
Jeff Layton <jlayton@...hat.com>, Ian Kent <raven@...maw.net>,
andres@...razel.de, keyrings@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Aleksa Sarai <cyphar@...har.com>
Subject: Re: Upcoming: Notifications, FS notifications and fsinfo()
On Tue, 31.03.20 10:56, Miklos Szeredi (miklos@...redi.hu) wrote:
> On Tue, Mar 31, 2020 at 10:34 AM Karel Zak <kzak@...hat.com> wrote:
> >
> > On Tue, Mar 31, 2020 at 07:11:11AM +0200, Miklos Szeredi wrote:
> > > On Mon, Mar 30, 2020 at 11:17 PM Christian Brauner
> > > <christian.brauner@...ntu.com> wrote:
> > >
> > > > Fwiw, putting down my kernel hat and speaking as someone who maintains
> > > > two container runtimes and various other low-level bits and pieces in
> > > > userspace who'd make heavy use of this stuff I would prefer the fd-based
> > > > fsinfo() approach especially in the light of across namespace
> > > > operations, querying all properties of a mount atomically all-at-once,
> > >
> > > fsinfo(2) doesn't meet the atomically all-at-once requirement.
> >
> > I guess your /proc-based idea has exactly the same problem...
>
> Yes, that's exactly what I wanted to demonstrate: there's no
> fundamental difference between the two APIs in this respect.
>
> > I see two possible ways:
> >
> > - after open("/mnt", O_PATH), create a copy-on-write object in the kernel
> > to represent the mount node -- the kernel will be able to modify it, but
> > userspace will get unchanged data from the FD until close()
> >
> > - improve fsinfo() to provide a set (list) of attributes in one call
>
> I think we are approaching this from the wrong end. Let's just
> ignore all of the proposed interfaces for now and only concentrate on
> what this will be used for.
>
> Start with a set of use cases by all interested parties. E.g.
>
> - systemd wants to keep track of attached mounts in a namespace, as well
> as new detached mounts created by fsmount()
>
> - systemd needs to keep information (such as parent, children, mount
> flags, fs options, etc.) up to date on any change of topology or
> attributes.

- We also have code that recursively remounts r/o or unmounts some
directory tree (with filters). That is currently nasty to do, since
the relationships between dirs are not always clear from
/proc/self/mountinfo alone, in particular not in an even remotely
atomic fashion, or when stuff is overmounted.
- We also have code that needs to check whether /dev/ is plain tmpfs or
devtmpfs. We cannot use statfs() for that, since TMPFS_MAGIC is
reported in both cases. Hence we currently parse
/proc/self/mountinfo to find the fstype string, which differs
between the two cases.
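
To make the two items above concrete, here is a rough Python sketch of
what parsing /proc/self/mountinfo for these purposes looks like. It is
an illustration only, not systemd's actual code: the names Mount,
parse_mountinfo, fstype_of and unmount_order are made up for this
example. It assumes that the last entry listed for a mount point is the
visible one, which is only a heuristic when stuff is overmounted, and it
reads the file non-atomically, so it has exactly the limitations
described above.

```python
import re
from typing import List, NamedTuple, Optional


class Mount(NamedTuple):
    # Hypothetical record type; holds only the fields this sketch needs.
    mount_id: int
    parent_id: int
    mount_point: str
    fstype: str


_OCTAL = re.compile(r"\\([0-7]{3})")


def _unescape(field: str) -> str:
    # mountinfo escapes space, tab, newline and backslash as \ooo (octal).
    return _OCTAL.sub(lambda m: chr(int(m.group(1), 8)), field)


def parse_mountinfo(text: str) -> List[Mount]:
    mounts = []
    for line in text.splitlines():
        fields = line.split(" ")
        try:
            # Fields 0-5 are fixed; optional fields follow, ended by "-".
            sep = fields.index("-", 6)
            mounts.append(Mount(int(fields[0]), int(fields[1]),
                                _unescape(fields[4]), fields[sep + 1]))
        except (ValueError, IndexError):
            continue  # skip malformed lines
    return mounts


def fstype_of(mounts: List[Mount], path: str) -> Optional[str]:
    # Heuristic: take the last entry listed for this mount point.
    for m in reversed(mounts):
        if m.mount_point == path:
            return m.fstype
    return None


def unmount_order(mounts: List[Mount], top_id: int) -> List[Mount]:
    # Descendants of mount top_id, children before their parents.
    children = {}
    for m in mounts:
        if m.parent_id != m.mount_id:  # guard against a self-parented root
            children.setdefault(m.parent_id, []).append(m)
    order = []

    def visit(mount_id: int) -> None:
        for child in children.get(mount_id, []):
            visit(child.mount_id)
            order.append(child)

    visit(top_id)
    return order


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        table = parse_mountinfo(f.read())
    print("/dev is", fstype_of(table, "/dev"))
```

Note that the table can change between reading it and acting on it, so
the ordering returned by unmount_order may already be stale by the time
it is used.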
Lennart
--
Lennart Poettering, Berlin