linux-kernel - Re: Using devices in Containers

Message-ID: <CALRD3qKU5gOi5K53ysOgF9Re+MjMLfcTYEK_3gq3ns2HNLkUFw@mail.gmail.com>
Date:	Thu, 25 Sep 2014 10:40:10 -0500
From:	riya khanna <riyakhanna1983@...il.com>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	LXC development mailing-list 
	<lxc-devel@...ts.linuxcontainers.org>,
	Miklos Szeredi <miklos@...redi.hu>,
	fuse-devel <fuse-devel@...ts.sourceforge.net>,
	Tejun Heo <tj@...nel.org>,
	Seth Forshee <seth.forshee@...onical.com>,
	linux-kernel@...r.kernel.org,
	Serge Hallyn <serge.hallyn@...ntu.com>
Subject: Re: Using devices in Containers

Is there a plan or work-in-progress to add namespace tags to other
classes in sysfs similar to net? Does it make sense to add namespace
tags to kobjects?

-Riya

On Wed, Sep 24, 2014 at 7:25 PM, riya khanna <riyakhanna1983@...il.com> wrote:
> On Wed, Sep 24, 2014 at 5:38 PM, Eric W. Biederman <ebiederm@...ssion.com>
> wrote:
>>
>> Riya Khanna <riyakhanna1983@...il.com> writes:
>>
>> > On Sep 24, 2014, at 12:43 PM, Eric W. Biederman <ebiederm@...ssion.com>
>> > wrote:
>> >
>> >> Serge Hallyn <serge.hallyn@...ntu.com> writes:
>> >>
>> >>> Isolation is provided by the devices cgroup.  You want something more
>> >>> than isolation.
>> >>>
>> >>> Quoting riya khanna (riyakhanna1983@...il.com):
>> >>>> My use case for having device namespaces is device isolation. Isn't
>> >>>> that what namespaces are there for (as I understand it)?
>> >>
>> >> Namespaces fundamentally provide for using the same ``global'' name
>> >> in different contexts.  This allows them to be used for isolation
>> >> and process migration (because you can take the same name from
>> >> machine to machine).
>> >>
>> >> Unless someone cares about device numbers at a namespace level
>> >> the work is done.
>> >>
>> >> The mount namespace exists to deal with file names.
>> >> The devices cgroup will limit which devices you can access (although
>> >> I can't ever imagine a case where the mount namespace would be
>> >> insufficient).
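
For reference, a minimal sketch of what a devices cgroup rule looks like.
It assumes the cgroup v1 devices controller, whose devices.allow and
devices.deny files take entries of the form "type major:minor access".
The helper names below are made up for illustration, and nothing is
written to any cgroup file here.

```python
import os
import stat


def devices_cgroup_entry(dev_type, major, minor, access="rwm"):
    """Format one devices-cgroup (v1) rule, e.g. 'c 1:3 rwm'.

    dev_type: 'c' (char), 'b' (block) or 'a' (all).
    major/minor: an integer or the wildcard '*'.
    access: any combination of r (read), w (write), m (mknod).
    """
    if dev_type not in ("a", "b", "c"):
        raise ValueError("type must be 'a', 'b' or 'c'")
    if not access or set(access) - set("rwm"):
        raise ValueError("access must be a combination of r, w, m")
    return f"{dev_type} {major}:{minor} {access}"


def entry_for_node(path, access="rwm"):
    """Derive the rule for an existing device node (hypothetical helper)."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        dev_type = "c"
    elif stat.S_ISBLK(st.st_mode):
        dev_type = "b"
    else:
        raise ValueError(f"{path} is not a device node")
    return devices_cgroup_entry(
        dev_type, os.major(st.st_rdev), os.minor(st.st_rdev), access
    )
```

Writing the resulting string to a container's devices.allow (or
devices.deny) file is what actually grants or revokes access; that step
needs privileges and is left out of the sketch.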
>> >>
>> >>>> Not everything should be accessible (or even visible) from a
>> >>>> container all the time (we have seen people come up with different
>> >>>> use cases for this). However, bind-mounting takes away this
>> >>>> flexibility.
>> >>
>> >> I don't see how.  If they are mounts that propagate into the container
>> >> and are controlled from outside you can do whatever you want.  (I am
>> >> imagining device-by-device bind mounts here).  It should be trivial
>> >> to have a directory tree that propagates into a container and works.
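
A rough sketch of the device-by-device bind mount idea, as I understand
it. It only builds the mount(8) command lines and runs nothing. The
directory /run/ctdev and the function names are hypothetical; the
--bind, --make-shared and umount invocations are the standard util-linux
ones. The assumption is that the host keeps a shared directory tree, the
container sees a slave copy of it, and the host grants or revokes a
device by mounting or unmounting inside that tree.

```python
import posixpath


def setup_cmds(shared_dir):
    """Host side: turn shared_dir into a mount point with shared propagation."""
    return [
        ["mount", "--bind", shared_dir, shared_dir],
        ["mount", "--make-shared", shared_dir],
    ]
    # The container's view of this tree would be marked with
    # "mount --make-slave" so that mount events flow host -> container only.


def grant_cmds(host_dev, shared_dir):
    """Expose one device node by bind-mounting it into the shared tree."""
    target = posixpath.join(shared_dir, posixpath.basename(host_dev))
    return [
        ["touch", target],  # a file bind mount needs an existing target
        ["mount", "--bind", host_dev, target],
    ]


def revoke_cmds(host_dev, shared_dir):
    """Withdraw the device again; the unmount propagates to the container."""
    target = posixpath.join(shared_dir, posixpath.basename(host_dev))
    return [["umount", target]]
```

Each list could be handed to subprocess.run() by a privileged manager
outside the container. Note that unmounting does not revoke file
descriptors a process has already opened on the device.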
>> >>
>> >
>> > Device-by-device bind mounts can grant/revoke access to real
>> > individual devices as and when needed. However, revoking the access to
>> > real devices could break the applications if there’s no transparent
>> > mechanism to back up the propagated (but now revoked) device bind
>> > mounts that could fool the apps into believing that they are working
>> > with real devices. Frame buffer is one such example, where safe
>> > multiplexing could be applied.
>> >
>> >>>> I agree that assigning fixed device numbers is clearly not a
>> >>>> long-term solution. Emulation for safe and flexible multiplexing,
>> >>>> like you suggested either using CUSE/FUSE or something like devpts,
>> >>>> is what I'm exploring.
>> >>
>> >> Is the problem you actually care about multiplexing devices?
>> >>
>> >
>> > The problem I care about is access to real devices, such as input, fb,
>> > loop, etc. as and when needed, thereby having native I/O performance -
>> > either through secure multiplexing or exclusive ownership, whatever
>> > makes sense according to the device type.
>>
>> Riya Khanna <riyakhanna1983@...il.com> writes:
>>
>> > I guess policy-based multiplexing (or exclusive ownership) is the
>> > usage. What kind of devices (loop, fb, etc.) this is needed for
>> > depends on the usage. If there are multiple FBs, then each container
>> > could potentially own one. One may want to provide exclusive ownership
>> > of input devices to one container at a time to avoid information
>> > leakage. Like we saw at LPC last year, this applies to sensors (gps,
>> > accelerometer, etc.) on mobile devices as well.
>>
>> Allowing multiplexing of those devices seems reasonable.
>>
>> Where the discussion ran into problems last time was that people did not
>> want to use any of the existing Linux solutions for multiplexing that
>> kind of thing and wanted to invent something new.
>>
>> Inventing something new is fine if the extra code maintenance can be
>> justified, or if the invention is just a better solution for all users and
>> new code can just start using that in general.
>>
>> The old solution to your problem of multiplexing devices is
>> allocating a virtual terminal and sending signals to coordinate
>> cooperatively sharing those resources.
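
To check my understanding of the virtual terminal approach, here is a
minimal sketch of the cooperative handshake. It assumes the VT_SETMODE
and VT_RELDISP ioctls and the struct vt_mode layout from <linux/vt.h>;
the function names and the choice of SIGUSR1/SIGUSR2 are mine, and the
ioctl calls need a file descriptor for a real VT, so they are not
exercised here.

```python
import fcntl
import signal
import struct

# Constants from <linux/vt.h>
VT_SETMODE = 0x5602
VT_RELDISP = 0x5605
VT_PROCESS = 0x01   # switching is controlled by this process
VT_ACKACQ = 0x02    # argument to VT_RELDISP: acknowledge acquisition


def pack_vt_mode(relsig=signal.SIGUSR1, acqsig=signal.SIGUSR2):
    """Build struct vt_mode {char mode, waitv; short relsig, acqsig, frsig;}."""
    return struct.pack("bbhhh", VT_PROCESS, 0, int(relsig), int(acqsig), 0)


def take_control(tty_fd):
    """Ask the kernel to signal us before and after every VT switch."""
    fcntl.ioctl(tty_fd, VT_SETMODE, pack_vt_mode())


def on_release(tty_fd):
    """relsig handler body: stop touching the device, then allow the switch."""
    # The application would restore the device to a known state here.
    fcntl.ioctl(tty_fd, VT_RELDISP, 1)


def on_acquire(tty_fd):
    """acqsig handler body: acknowledge that we own the device again."""
    fcntl.ioctl(tty_fd, VT_RELDISP, VT_ACKACQ)
```

The point of the sketch is that the sharing is voluntary: the kernel
only asks, and the application has to answer with VT_RELDISP.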
>>
>> If you want some sort of preemptive multitasking, that requires
>> a bit more effort, and work in the device abstractions.
>> You may be able to share concepts and library code but I don't believe
>> there is something you can just paint on top of devices and make it
>> happen.  Certainly in the bad old days of X terminal switching the
>> cooperation was necessary: when a video card was yanked from an
>> application writing directly to that video card, the application
>> needed to restore the video card to a known state so the next application
>> would have a chance of making sense of it.  Furthermore, most devices
>> are not safe to let unprivileged users access their control registers
>> directly.
>>
>> All of which boils down to the simple fact that for each type of device
>> you would like to share it is necessary to update the subsystem to support
>> arbitrary numbers of virtual devices that you can talk to.
>>
>> The macvlan driver in the networking stack is a rough example of what I
>> expect you would like.  Something that takes one real physical device
>> and turns it into N virtual devices each of which runs at effectively
>> full speed.  Along with some kind of new master interface for
>> controlling when the multiplexing takes place.
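
A toy model of that shape, purely to pin down the terms: one real
device, N virtual endpoints, and a master interface that decides which
endpoint currently reaches the hardware. The class and method names are
hypothetical and this is userspace Python, not a driver. Dropping writes
from inactive endpoints is one possible policy (exclusive ownership); a
real multiplexer might buffer or emulate instead.

```python
class DeviceMux:
    """One real device shared by several virtual endpoints, one active at a time."""

    def __init__(self):
        self.real_device = []   # stands in for the hardware: writes that got through
        self.endpoints = set()
        self.active = None

    def add_endpoint(self, name):
        """Create a virtual device, e.g. one per container."""
        self.endpoints.add(name)

    def switch_to(self, name):
        """Master interface: choose which endpoint owns the real device."""
        if name not in self.endpoints:
            raise KeyError(name)
        self.active = name

    def write(self, name, data):
        """Forward a write if the endpoint is active, otherwise drop it."""
        if name not in self.endpoints:
            raise KeyError(name)
        if name != self.active:
            return False
        self.real_device.append((name, data))
        return True


mux = DeviceMux()
mux.add_endpoint("container-a")
mux.add_endpoint("container-b")
mux.switch_to("container-a")
mux.write("container-a", b"frame 1")   # reaches the device
mux.write("container-b", b"frame 2")   # dropped, container-b is not active
```

With exclusive ownership only the active path is forwarded, which is
where the "effectively full speed" property would have to come from.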
>>
>> I think we do most of this in software today and arguably for a lot of
>> devices the overhead is small enough that a software solution is fine.
>> So perhaps all you need is a fuse interface to the existing software
>> multiplexers so that weird legacy code can be made to run.
>>
>
> What kind of existing multiplexers could be used? Is there one for fb? We
> have evdev abstractions for input in place already.
>
>> Now I suspect part of doing this right will be getting proper video
>> drivers on Android.  I assume that Android is the platform you care
>> about.
>>
>> Eric
>>
>>
>