Message-ID: <20080306031525.GA9070@sergelap.austin.ibm.com>
Date: Wed, 5 Mar 2008 21:15:25 -0600
From: "Serge E. Hallyn" <serue@...ibm.com>
To: Greg KH <greg@...ah.com>
Cc: Pavel Emelyanov <xemul@...nvz.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Paul Menage <menage@...gle.com>,
Sukadev Bhattiprolu <sukadev@...ibm.com>,
Serge Hallyn <serue@...ibm.com>
Subject: Re: [PATCH 0/9] Devices accessibility control group (v4)
Quoting Greg KH (greg@...ah.com):
> On Wed, Mar 05, 2008 at 08:23:35PM +0300, Pavel Emelyanov wrote:
> > Changes from v3:
> > * Ported on 2.6.25-rc3-mm1;
> > * Re-splitted into smaller pieces;
> > * Added more comments to tricky places.
> >
> > This controller allows to tune the devices accessibility by tasks,
> > i.e. grant full access for /dev/null, /dev/zero etc, grant read-only
> > access to IDE devices and completely hide SCSI disks.
>
> From within the kernel itself? The kernel should not be keeping track
> of the mode of devices, that's what the filesystem holding /dev is for.
> Those modes change all the time depending on the device plugged in, and
> the user using the "console". Why should the kernel need to worry about
> any of this?

These are distinct from the permissions on device files. No matter what
the permissions on the device file are, a task in a devcg cgroup which
isn't allowed to write to chardev 4:64 will not be able to write to
/dev/ttyS0.
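
To make the layering concrete, here is a toy model in Python of the two
independent checks: the ordinary file-mode check on the device node, and
the per-cgroup device check. It is only a sketch under stated
assumptions, not the code from the patch set. The names Rule,
DeviceCgroup, file_mode_permits and may_open are hypothetical, the rule
format is invented for illustration, and the file-mode check is
simplified to the "other" permission bits.

```python
from dataclasses import dataclass, field

READ, WRITE = "r", "w"


@dataclass(frozen=True)
class Rule:
    """One allowed device: type ('c' or 'b'), major:minor, access set."""
    dev_type: str
    major: int
    minor: int
    access: frozenset


@dataclass
class DeviceCgroup:
    """Hypothetical stand-in for a devcg cgroup: a list of allowed devices."""
    rules: list = field(default_factory=list)

    def allow(self, dev_type, major, minor, access):
        self.rules.append(Rule(dev_type, major, minor, frozenset(access)))

    def permits(self, dev_type, major, minor, want):
        # Access is granted only if some rule names this exact device
        # and includes the requested access type.
        return any(
            r.dev_type == dev_type
            and r.major == major
            and r.minor == minor
            and want in r.access
            for r in self.rules
        )


def file_mode_permits(mode, is_root, want):
    """Simplified file-permission check on the device node.

    Assumption: only the 'other' bits are consulted, and root passes
    unconditionally (as CAP_DAC_OVERRIDE would allow).
    """
    if is_root:
        return True
    bit = 0o004 if want == READ else 0o002
    return bool(mode & bit)


def may_open(cgroup, dev_type, major, minor, mode, is_root, want):
    # Both layers must agree; the cgroup check applies even to root.
    return file_mode_permits(mode, is_root, want) and cgroup.permits(
        dev_type, major, minor, want
    )


# A cgroup that may only read chardev 4:64 (/dev/ttyS0).
cg = DeviceCgroup()
cg.allow("c", 4, 64, "r")

root_can_read = may_open(cg, "c", 4, 64, 0o666, True, READ)
root_can_write = may_open(cg, "c", 4, 64, 0o666, True, WRITE)
```

In this model root_can_read is True and root_can_write is False, even
though the node's mode is 0666 and the task is root: the file
permissions are satisfied, but the cgroup check is not.
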

The purpose is to prevent a root task from granting itself access to
certain devices. Without this, the only option currently is to take
CAP_MKNOD out of the capability bounding set for a container, make
sure that /dev is set up right, and enforce nodev for mounts. In
itself that doesn't sound so bad, and it was my preference at first, but
the argument is that things like udev should be able to run in a
container, and will complain about not being able to create devices.

> > Tasks still can call mknod to create device files, regardless of
> > whether the particular device is visible or accessible, but they
> > may not be able to open it later.
> >
> > This one hides under CONFIG_CGROUP_DEVS option.
> >
> > To play with it - run a standard procedure:
> >
> > # mount -t container none /cont/devs -o devices
> > # mkdir /cont/devs/0
> > # echo -n $$ > /cont/devs/0/tasks
>
> What is /cont/ for?
cgroups used to be called containers, so 'cont' is presumably shorthand
for container.
> > and tune device permissions.
>
> How is this done?
>
> Why would the kernel care about this stuff?
Because there is no way for userspace to restrict a root process in a
container from accessing whatever devices it wants.
> confused,
>
> greg k-h
thanks,
-serge