netdev - Re: RFC(v2): Audit Kernel Container IDs

Message-ID: <f8ea78be-9bbf-2967-7b12-ac93bb85b0bc@schaufler-ca.com>
Date:   Sat, 9 Dec 2017 10:28:08 -0800
From:   Casey Schaufler <casey@...aufler-ca.com>
To:     Mickaël Salaün <mic@...ikod.net>,
        Richard Guy Briggs <rgb@...hat.com>, cgroups@...r.kernel.org,
        Linux Containers <containers@...ts.linux-foundation.org>,
        Linux API <linux-api@...r.kernel.org>,
        Linux Audit <linux-audit@...hat.com>,
        Linux FS Devel <linux-fsdevel@...r.kernel.org>,
        Linux Kernel <linux-kernel@...r.kernel.org>,
        Linux Network Development <netdev@...r.kernel.org>
Cc:     mszeredi@...hat.com, "Eric W. Biederman" <ebiederm@...ssion.com>,
        Simo Sorce <simo@...hat.com>, jlayton@...hat.com,
        Carlos O'Donell <carlos@...hat.com>,
        David Howells <dhowells@...hat.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Andy Lutomirski <luto@...nel.org>,
        Eric Paris <eparis@...isplace.org>, trondmy@...marydata.com,
        Michael Kerrisk <mtk.manpages@...il.com>
Subject: Re: RFC(v2): Audit Kernel Container IDs

On 12/9/2017 2:20 AM, Mickaël Salaün wrote:
> On 12/10/2017 18:33, Casey Schaufler wrote:
>> On 10/12/2017 7:14 AM, Richard Guy Briggs wrote:
>>> Containers are a userspace concept.  The kernel knows nothing of them.
>>>
>>> The Linux audit system needs a way to be able to track the container
>>> provenance of events and actions.  Audit needs the kernel's help to do
>>> this.
>>>
>>> Since the concept of a container is entirely a userspace concept, a
>>> registration from the userspace container orchestration system initiates
>>> this.  This will define a point in time and a set of resources
>>> associated with a particular container with an audit container ID.
>>>
>>> The registration is a pseudo filesystem (proc, since PID tree already
>>> exists) write of a u8[16] UUID representing the container ID to a file
>>> representing a process that will become the first process in a new
>>> container.  This write might place restrictions on mount namespaces
>>> required to define a container, or at least careful checking of
>>> namespaces in the kernel to verify permissions of the orchestrator so it
>>> can't change its own container ID.  A bind mount of nsfs may be
>>> necessary in the container orchestrator's mntNS.
>>> Note: Use a 128-bit scalar rather than a string to make compares faster
>>> and simpler.
>>>
>>> Require a new CAP_CONTAINER_ADMIN to be able to carry out the
>>> registration.
>> Hang on. If containers are a user space concept, how can
>> you want CAP_CONTAINER_ANYTHING? If there's no such thing as
>> a container, how can you be asking for a capability to manage
>> them?
>>
>>>   At that time, record the target container's user-supplied
>>> container identifier along with the target container's first process
>>> (which may become the target container's "init" process) process ID
>>> (referenced from the initial PID namespace), all namespace IDs (in the
>>> form of a nsfs device number and inode number tuple) in a new auxiliary
>>> record AUDIT_CONTAINER with a qualifying op=$action field.
> Here is an idea to avoid privilege problems or the need for a new
> capability: make it automatic. What makes a container a container seems
> to be the use of at least a namespace.

You might think so, but I am assured that you can have a container
without using namespaces. Intel's "Clear Containers", which use
virtualization technology, are one example. I have considered creating
"Smack Containers" using mandatory access control technology, more
to press the point that "containers" is a marketing concept, not
technology.

>  What about automatically creating
> and assigning an ID to a process when it enters a namespace different from
> one of its parent process's? This delegates the (permission)
> responsibility to the use of namespaces (e.g. /proc/sys/user/max_* limit).

That gets ugly when you have a container that uses user, filesystem,
network and whatever else namespaces. If all containers used the same
set of namespaces I think this would be a fine idea, but they don't.
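
To make the point about differing namespace sets concrete, here is a
minimal sketch (not from the thread) that reads the nsfs (device, inode)
pair for each entry under /proc/<pid>/ns and reports which namespaces two
processes share. The helper names namespace_ids and shared_namespaces are
hypothetical, and the code assumes a Linux procfs; without one it simply
returns an empty mapping.

```python
import os


def namespace_ids(pid="self"):
    """Return {namespace name: (st_dev, st_ino)} for /proc/<pid>/ns/*.

    Hypothetical helper: entries that cannot be stat'ed (no permission,
    namespace not yet created) are skipped rather than reported.
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return ids  # no procfs (non-Linux) or no access to this pid
    for name in names:
        try:
            # stat() follows the symlink to the nsfs inode itself.
            st = os.stat(os.path.join(ns_dir, name))
        except OSError:
            continue
        ids[name] = (st.st_dev, st.st_ino)
    return ids


def shared_namespaces(a, b):
    """Names of namespaces whose (dev, ino) pair is identical in a and b."""
    return sorted(n for n in a.keys() & b.keys() if a[n] == b[n])


if __name__ == "__main__":
    mine = namespace_ids()
    for name, (dev, ino) in mine.items():
        print(f"{name}: dev={dev} ino={ino}")
```

Two container runtimes that unshare different subsets (say, only mnt and
net versus user, mnt, net and pid) would show different overlaps with the
host under shared_namespaces, which is why "entered a new namespace" alone
does not pin down where a container begins.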

> One interesting side effect of this approach would be to be able to
> identify which processes are in the same set of namespaces, even if not
> spawned from the container but entered after its creation (i.e. using
> setns), by creating container IDs as a (deterministic) checksum from the
> /proc/self/ns/* IDs.
>
> Since the concern is to identify a container, I think the ability to
> audit the switch from one container ID to another is enough. I don't
> think we need nested IDs.

Because a container doesn't have to use namespaces to be a container
you still need a mechanism for a process to declare that it is in fact
in a container, and to identify the container.
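
For illustration only, the checksum suggestion quoted above could look
roughly like the sketch below. The thread does not name a hash or an
encoding; SHA-256 truncated to 16 bytes (to match the proposed u8[16] ID),
the function name container_id_from_namespaces, and the sample device and
inode numbers are all assumptions made for this example.

```python
import hashlib
import struct


def container_id_from_namespaces(ns_ids):
    """Derive a 16-byte ID from {name: (st_dev, st_ino)}.

    Hypothetical scheme: hash the namespace names and their nsfs
    (device, inode) pairs in sorted order so the result does not depend
    on the order in which the entries were collected.
    """
    digest = hashlib.sha256()
    for name in sorted(ns_ids):
        dev, ino = ns_ids[name]
        digest.update(name.encode() + b"\0")
        digest.update(struct.pack("<QQ", dev, ino))
    return digest.digest()[:16]


# Made-up values, for illustration only.
host = {"mnt": (4, 4026531840), "net": (4, 4026531992)}
same = {"net": (4, 4026531992), "mnt": (4, 4026531840)}
other = {"mnt": (4, 4026532200), "net": (4, 4026531992)}

if __name__ == "__main__":
    print(container_id_from_namespaces(host).hex())
    print(container_id_from_namespaces(host) == container_id_from_namespaces(same))
    print(container_id_from_namespaces(host) == container_id_from_namespaces(other))
```

Under such a scheme a "container" that shares every namespace with the
host hashes to the host's own ID, which is the case where an explicit
declaration is still needed.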

>
> As a side note, you may want to take a look at the Linux-VServer's XID.
>
> Regards,
>  Mickaël
>