Date:	Sun, 06 Mar 2016 15:53:40 -0600
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	"Serge E. Hallyn" <serge.hallyn@...ntu.com>
Cc:	lkml <linux-kernel@...r.kernel.org>,
	Seth Forshee <seth.forshee@...onical.com>,
	Stéphane Graber <stgraber@...ntu.com>,
	serge@...lyn.com, Andy Lutomirski <luto@...capital.net>
Subject: Re: user namespace and fully visible proc and sys mounts

"Serge E. Hallyn" <serge.hallyn@...ntu.com> writes:

> Hi,
>
> So we've been over this many times...  but unfortunately there is more
> breakage to report.  Regular privileged and unprivileged containers
> work all right for us.  But running an unprivileged container inside a
> privileged container is blocked.
>
> When creating privileged containers, lxc by default does a few things:
> it mounts some fuse.lxcfs files over proc files including /proc/meminfo and
> /proc/uptime.  It mounts proc rw but /proc/sysrq-trigger ro as well as
> moves /proc/sys/net out of the way, bind-mounts /proc/sys readonly
> (because this container is not in a user namespace) then moves
> /proc/sys/net back.  Finally it mounts sys ro but bind-mounts
> /sys/devices/virtual/net as writeable.
>
> If any of these are left enabled, unprivileged containers can't be
> started.  If all are disabled, then they can be.
>
> Can we find a way to make these not block remounts in child user
> namespaces?  A boot flag, a procfs and sysfs mount option, a sysctl?

Are any of these overmounts done for the purpose of security?  It
appears the /proc/sys and /sys mounts being made read-only is for that
purpose.

If none of the mounts are for security, the easy solution that works
today is to also mount /proc and /sys somewhere else in your container
so that the permission check for mounting a new copy passes.
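
As a rough illustration of that permission check, here is a simplified
Python model.  It is a sketch under stated assumptions, not the kernel
code: it assumes a fresh proc or sysfs mount is allowed only when the
mount namespace already holds a mount of the whole filesystem that is
not locked read-only (unless the new mount is read-only too) and has no
locked overmounts on anything other than an empty directory.  The names
Mount, ChildMount, fully_visible and the path /hypothetical/spare/proc
are invented for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChildMount:
    """Something mounted on top of a path inside a proc or sysfs mount."""
    path: str
    locked: bool          # inherited from a more privileged namespace
    over_empty_dir: bool  # True if the covered path is an empty directory


@dataclass
class Mount:
    fstype: str
    path: str
    whole_fs: bool = True          # False for a bind mount of a subtree
    locked_readonly: bool = False  # read-only flag the namespace cannot clear
    children: List[ChildMount] = field(default_factory=list)


def fully_visible(mounts, fstype, new_readonly=False):
    """Return True if some existing mount of fstype justifies a fresh one.

    Simplified model (assumption): one qualifying mount is enough.
    """
    for m in mounts:
        if m.fstype != fstype or not m.whole_fs:
            continue
        # A fresh read-write mount must not defeat a locked read-only one.
        if m.locked_readonly and not new_readonly:
            continue
        # A locked overmount on a file or non-empty directory hides content.
        if any(c.locked and not c.over_empty_dir for c in m.children):
            continue
        return True
    return False


# /proc as set up for the privileged container described above.
lxc_proc = Mount("proc", "/proc", children=[
    ChildMount("/proc/meminfo", locked=True, over_empty_dir=False),
    ChildMount("/proc/uptime", locked=True, over_empty_dir=False),
    ChildMount("/proc/sysrq-trigger", locked=True, over_empty_dir=False),
    ChildMount("/proc/sys", locked=True, over_empty_dir=False),
])
# An extra, untouched copy of proc mounted somewhere else in the container.
spare = Mount("proc", "/hypothetical/spare/proc")

print(fully_visible([lxc_proc], "proc"))         # False
print(fully_visible([lxc_proc, spare], "proc"))  # True
```

In this model the overmounted /proc alone fails the check, while the
additional untouched copy makes it pass, which is why the extra mount
is only acceptable when none of the overmounts exist for security.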

That said, /proc/sys appears to be a show stopper in this scheme.  Because
the root of your privileged container can enter your unprivileged
container, it can bypass your read-only /proc/sys by mounting a new copy
of proc if we allow the relaxation you are requesting.

Therefore the only choice on the table (and I don't have a clue how
realistic it is) is to have a variant of proc with just files describing
processes.  Call it processfs.  That would not need the current
restrictions.

As for sysfs, I am drawing a blank about what might be possible.

Eric
