[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whw1cnWPSWw1tHXT5xTMbnJFbCtx9_re6U1RZ_7LE9gXA@mail.gmail.com>
Date: Sun, 31 Mar 2019 16:40:15 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Christian Brauner <christian@...uner.io>
Cc: Andy Lutomirski <luto@...capital.net>,
Daniel Colascione <dancol@...gle.com>,
Jann Horn <jannh@...gle.com>,
Andrew Lutomirski <luto@...nel.org>,
David Howells <dhowells@...hat.com>,
"Serge E. Hallyn" <serge@...lyn.com>,
Linux API <linux-api@...r.kernel.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Arnd Bergmann <arnd@...db.de>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
Kees Cook <keescook@...omium.org>,
Alexey Dobriyan <adobriyan@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Michael Kerrisk-manpages <mtk.manpages@...il.com>,
Jonathan Kowalski <bl0pbl33p@...il.com>,
"Dmitry V. Levin" <ldv@...linux.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Oleg Nesterov <oleg@...hat.com>,
Nagarathnam Muthusamy <nagarathnam.muthusamy@...cle.com>,
Aleksa Sarai <cyphar@...har.com>,
Al Viro <viro@...iv.linux.org.uk>,
Joel Fernandes <joel@...lfernandes.org>
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
On Sun, Mar 31, 2019 at 3:16 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> I would actually suggest we just make the rules be that the
> pidfd_open() always return the internal /proc entry regardless of any
> mount-point (or any "hidepid")
The clever alternative, which might be the RightWay(tm) is to just
create a completely unattached dentry, and basically tie it into the
actual /proc filesystem hierarchy at lookup() time when somebody does
the openat() using it for the first time. You'd get a very simple
callback: since the dentry would be unattached, you'd be guaranteed to
get a "lookup()" from the VFS layer, and that lookup would then do the
"hook into the actual /proc filesystem".
We already kind of do things like that in the VFS layer when we have
unattached dentries (because of "look up by filehandle" etc). In many
ways the "pidfd_open()" really is exactly a "look up by file handle"
operation, it just so happens that the "file handle" is just the
pid/namespace combination.
And if the splice alias (which is what the VFS layer calls that "tie
aliased dentry to the parent" operation) fails, because the /proc
filesystem isn't mounted or whatever, then trying to look up names off
the thing will also fail.
It's a tiny bit too clever for my taste, and it's not *exactly* the
same thing as our normal inode alias handling, but it's pretty close
conceptually (and even practically).
So it would basically do something very similar to the ioctl, but it
would do it implicitly and automatically at that first lookup.
That would also mean that you'd not actually pay the cost of doing any
of this *unless* you also end up trying to open things in /proc using
that pidfd.
Linus
Powered by blists - more mailing lists