linux-kernel - Re: [RFC] Add option to mount only a pids subset


Message-ID: <20170313132732.GR29622@ZenIV.linux.org.uk>
Date:   Mon, 13 Mar 2017 13:27:33 +0000
From:   Al Viro <viro@...IV.linux.org.uk>
To:     Andy Lutomirski <luto@...capital.net>
Cc:     Alexey Gladkov <gladkov.alexey@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Vasiliy Kulikov <segoon@...nwall.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Oleg Nesterov <oleg@...hat.com>,
        Pavel Emelyanov <xemul@...allels.com>,
        James Bottomley <James.Bottomley@...senpartnership.com>
Subject: Re: [RFC] Add option to mount only a pids subset

On Sun, Mar 12, 2017 at 08:19:33PM -0700, Andy Lutomirski wrote:
> On Sat, Mar 11, 2017 at 6:13 PM, Al Viro <viro@...iv.linux.org.uk> wrote:
> > PS: AFAICS, simple mount --bind of your pid-only mount will suddenly
> > expose the full thing.  And as for the lifetimes making no sense...
> > note that you are simply not freeing these structures of yours.
> > Try to handle that and you'll get a serious PITA all over the
> > place.
> >
> > What are you trying to achieve, anyway?  Why not add a second vfsmount
> > pointer per pid_namespace and make it initialized on demand, at the
> > first attempt of no-pid mount?  Just have a separate no-pid instance
> > created for those namespaces where it had been asked for, with
> > separate superblock and dentry tree not containing anything other
> > than pid-only parts + self + thread-self...
> 
> Can't we just make procfs work like most other filesystems and have
> each mount have its own superblock?  If we need to do something funky
> to stat() output to keep existing userspace working, I think that's
> okay.

First of all, most of the filesystems do *NOT* guarantee anything of
that sort.  And what's the point of having more instances than
necessary, anyway?

> As far as I can tell, proc_mnt is very nearly useless -- it seems to
> be used for proc_flush_task (which claims to be purely an optimization
> and could be preserved in the common case where there's only one
> relevant mount) and for sysctl_binary.  For the latter, we could
> create proc_mnt but make actual user-initiated mounts be new
> superblocks anyway.

Again, what for?  It won't salvage that kludge...  It's not as if it
had been hard to have separate pid-only instance created when asked
for (and reused every time when we are asked for pid-only).  What's
the point of ever having more than two instances per pidns?  IDGI...

Folks, there is no one-to-one correspondence between mountpoints and
superblocks.  Not since 2000 or so.  Just don't try to shove your
per-superblock stuff into vfsmount; it simply won't work.  If you
want a separate instance for that thing, then just go ahead and
have ->mount() decide which one to use (and whether to create a new
one).  All there is to it...
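
A minimal userspace sketch of the arrangement described above: at most two
instances per pid namespace, with the mount routine picking (or lazily
creating) the right one, and any number of mounts sharing an instance.
This is a toy model, not kernel code. Every name in it (toy_super,
toy_pidns, toy_mount, toy_proc_mount, toy_bind) is hypothetical, and it
ignores locking, refcount teardown and everything else a real ->mount()
would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a superblock: one filesystem instance. */
struct toy_super {
    bool pid_only;  /* true: only pid entries + self + thread-self */
    int  mounts;    /* how many mounts currently share this instance */
};

/* Toy stand-in for a pid namespace: at most two instances, made on demand. */
struct toy_pidns {
    struct toy_super *full;      /* ordinary instance */
    struct toy_super *pid_only;  /* created at the first pid-only mount */
};

/* Toy stand-in for a mount: just a reference to an instance. */
struct toy_mount {
    struct toy_super *sb;
};

/* Reuse the instance cached in *slot, creating it on first use. */
static struct toy_super *toy_get_super(struct toy_super **slot, bool pid_only)
{
    if (!*slot) {
        *slot = calloc(1, sizeof(**slot));
        if (!*slot)
            return NULL;
        (*slot)->pid_only = pid_only;
    }
    (*slot)->mounts++;
    return *slot;
}

/* The "->mount() decides" step: choose which instance backs the new mount. */
static struct toy_mount toy_proc_mount(struct toy_pidns *ns, bool pid_only)
{
    struct toy_mount m;

    m.sb = toy_get_super(pid_only ? &ns->pid_only : &ns->full, pid_only);
    return m;
}

/* A bind mount is a new mount of the same instance, never a new one. */
static struct toy_mount toy_bind(const struct toy_mount *src)
{
    struct toy_mount m;

    assert(src != NULL);
    m.sb = src->sb;
    if (m.sb)
        m.sb->mounts++;
    return m;
}
```

In this model the restriction is a property of the instance rather than of
the mount, so a bind of a pid-only mount refers to the same pid-only
instance. That is why, under these assumptions, the per-superblock state
has no reason to live in the mount structure at all.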