Date: Tue, 13 Dec 2016 07:18:27 +1300
From: ebiederm@...ssion.com (Eric W. Biederman)
To: "Michael Kerrisk \(man-pages\)" <mtk.manpages@...il.com>
Cc: Andrei Vagin <avagin@...nvz.org>,
Containers <containers@...ts.linux-foundation.org>,
Linux API <linux-api@...r.kernel.org>,
lkml <linux-kernel@...r.kernel.org>,
"linux-fsdevel\@vger.kernel.org" <linux-fsdevel@...r.kernel.org>,
James Bottomley <James.Bottomley@...senpartnership.com>,
"W. Trevor King" <wking@...mily.us>,
Alexander Viro <viro@...iv.linux.org.uk>,
"Serge E. Hallyn" <serge@...lyn.com>
Subject: Re: Documenting the ioctl interfaces to discover relationships between namespaces

"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com> writes:
> On 12/11/2016 11:30 PM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com> writes:
>>
>>> [was: [PATCH 0/4 v3] Add an interface to discover relationships
>>> between namespaces]
>>
>> One small comment below.
>>
>>>
>>> Introspecting namespace relationships
>>> Since Linux 4.9, two ioctl(2) operations are provided to allow
>>> introspection of namespace relationships (see user_namespaces(7)
>>> and pid_namespaces(7)). The form of the calls is:
>>>
>>> ioctl(fd, request);
>>>
>>> In each case, fd refers to a /proc/[pid]/ns/* file.
>>>
>>> NS_GET_USERNS
>>> Returns a file descriptor that refers to the owning user
>>> namespace for the namespace referred to by fd.
>>>
>>> NS_GET_PARENT
>>> Returns a file descriptor that refers to the parent names‐
>>> pace of the namespace referred to by fd. This operation is
>>> valid only for hierarchical namespaces (i.e., PID and user
>>> namespaces). For user namespaces, NS_GET_PARENT is synony‐
>>> mous with NS_GET_USERNS.
>>>
>>> In each case, the returned file descriptor is opened with O_RDONLY
>>> and O_CLOEXEC (close-on-exec).
>>>
>>> By applying fstat(2) to the returned file descriptor, one obtains
>>> a stat structure whose st_ino (inode number) field identifies the
>>> owning/parent namespace. This inode number can be matched with
>>> the inode number of another /proc/[pid]/ns/{pid,user} file to
>>> determine whether that is the owning/parent namespace.
>>
>> Like all fstat inode comparisons to be fully accurate you need to
>> compare both the st_ino and st_dev. I reserve the right for st_dev to
>> be significant when comparing namespaces. Otherwise I might have to
>> create a namespace of namespaces someday and that is ugly.
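
For illustration, that comparison could look something like the sketch
below. It is not taken from any kernel or libc header: the helper names
same_ns_identity() and fd_matches_path() are made up for this example.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper: two stat results name the same namespace only if
 * both the device and the inode number agree. */
static bool same_ns_identity(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical helper: compares a descriptor (for example one returned by
 * NS_GET_USERNS or NS_GET_PARENT) with a /proc/[pid]/ns/{pid,user} path.
 * Returns 1 on a match, 0 on a mismatch, -1 if either stat call fails. */
static int fd_matches_path(int fd, const char *path)
{
    struct stat a, b;

    if (fstat(fd, &a) == -1 || stat(path, &b) == -1)
        return -1;
    return same_ns_identity(&a, &b) ? 1 : 0;
}
```

stat(2) follows the /proc/[pid]/ns/* symlink, so both sides of the
comparison describe the namespace file itself.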
>>
>>> Either of these ioctl(2) operations can fail with the following
>>> error:
>>>
>>> EPERM The requested namespace is outside of the caller's names‐
>>> pace scope. This error can occur if, for example, the own‐
>>> ing user namespace is an ancestor of the caller's current
>>> user namespace. It can also occur on attempts to obtain
>>> the parent of the initial user or PID namespace.
>>>
>>> Additionally, the NS_GET_PARENT operation can fail with the fol‐
>>> lowing error:
>>>
>>> EINVAL fd refers to a nonhierarchical namespace.
>>>
>>> See the EXAMPLE section for an example of the use of these opera‐
>>> tions.
>
> So, after playing with this a bit, I have a question.
>
> I gather that in order to, for example, elaborate the tree of user
> namespaces on the system, one would use NS_GET_PARENT on each of
> the /proc/*/ns/user files and match up the results. Right?
>
> What happens if one of the parent user namespaces contains no
> processes? That is, the parent namespace exists by virtue of being
> pinned because a /proc/PID/ns/user file is open or bind mounted.
> (Chrome seems to do this sort of dance with user namespaces, for
> example.) How do we find the ancestor of *that* user namespace?

NS_GET_USERNS and NS_GET_PARENT each return a file descriptor, and you
can call NS_GET_PARENT on that descriptor in turn. So you can keep
walking up to the ancestors whether or not any process is in them.
Eric