Date:   Sun, 11 Dec 2016 12:54:56 +0100
From:   "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To:     Andrei Vagin <avagin@...nvz.org>
Cc:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Containers <containers@...ts.linux-foundation.org>,
        Linux API <linux-api@...r.kernel.org>,
        lkml <linux-kernel@...r.kernel.org>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        James Bottomley <James.Bottomley@...senpartnership.com>,
        "W. Trevor King" <wking@...mily.us>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Serge Hallyn <serge.hallyn@...onical.com>,
        Michael Kerrisk <mtk.manpages@...il.com>
Subject: Documenting the ioctl interfaces to discover relationships between namespaces

[was: [PATCH 0/4 v3] Add an interface to discover relationships
between namespaces]

Hello Andrei

See below for my attempt to document the following.

On 6 September 2016 at 09:47, Andrei Vagin <avagin@...nvz.org> wrote:
> From: Andrey Vagin <avagin@...nvz.org>
>
> Each namespace has an owning user namespace, and currently there is no
> way to discover these relationships.
>
> Pid and user namespaces are hierarchical. There is no way to discover
> parent-child relationships either.
>
> Why might we want to know the relationships between namespaces?
>
> One use would be visualization, in order to understand the running
> system.  Another would be to answer the question: what capability does
> process X have to perform operations on a resource governed by namespace
> Y?
>
> One more use case (which is usually called abnormal) is checkpoint/restart.
> In CRIU we are going to dump and restore nested namespaces.
>
> There was a discussion [1] about which interface to choose to determine
> relationships between namespaces.
>
> Eric suggested to add two ioctl-s [2]:
>> Grumble, Grumble.  I think this may actually be a case for creating ioctls
>> for these two cases.  Now that random nsfs file descriptors are bind
>> mountable the original reason for using proc files is not as pressing.
>>
>> One ioctl for the user namespace that owns a file descriptor.
>> One ioctl for the parent namespace of a namespace file descriptor.
>
> Here is an implementation of these ioctl-s.
>
> $ man man7/namespaces.7
> ...
> Since  Linux  4.X,  the  following  ioctl(2)  calls are supported for
> namespace file descriptors.  The correct syntax is:
>
>       fd = ioctl(ns_fd, ioctl_type);
>
> where ioctl_type is one of the following:
>
> NS_GET_USERNS
>       Returns a file descriptor that refers to an owning user names‐
>       pace.
>
> NS_GET_PARENT
>       Returns  a  file descriptor that refers to a parent namespace.
>       This ioctl(2) can be used for pid  and  user  namespaces.  For
>       user namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
>       meaning.
>
> In addition to generic ioctl(2) errors, the following  specific  ones
> can occur:
>
> EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.
>
> EPERM  The  requested  namespace  is outside of the current namespace
>       scope.
>
> [1] https://lkml.org/lkml/2016/7/6/158
> [2] https://lkml.org/lkml/2016/7/9/101

The following is the text I propose to add to the namespaces(7) page.
Could you please review it and let me know of any corrections or
improvements?

Thanks,

Michael


   Introspecting namespace relationships
       Since Linux 4.9, two ioctl(2) operations  are  provided  to  allow
       introspection  of  namespace relationships (see user_namespaces(7)
       and pid_namespaces(7)).  The form of the calls is:

           ioctl(fd, request);

       In each case, fd refers to a /proc/[pid]/ns/* file.
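
As a minimal sketch of obtaining such an fd, the helpers below build the path of a /proc/[pid]/ns/* file and open it. The names ns_path() and open_ns_file() are hypothetical and are not part of the proposed text or of any library; rejecting names that contain '/' is an extra precaution assumed here.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "/proc/<pid>/ns/<name>" in buf (hypothetical helper).
   Returns 0 on success, or -1 if the name contains '/' or the
   resulting path does not fit in the buffer. */
static int
ns_path(char *buf, size_t size, pid_t pid, const char *name)
{
    int n;

    if (strchr(name, '/') != NULL)
        return -1;

    n = snprintf(buf, size, "/proc/%ld/ns/%s", (long) pid, name);
    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}

/* Open the namespace file of the given type for the given process.
   Returns a file descriptor, or -1 on error (errno is set by open(2)
   when the failure comes from opening the file). */
static int
open_ns_file(pid_t pid, const char *name)
{
    char path[64];

    if (ns_path(path, sizeof(path), pid, name) == -1)
        return -1;

    return open(path, O_RDONLY);
}
```

A caller could then write, for example, fd = open_ns_file(getpid(), "user") and pass fd to either of the operations described below.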

       NS_GET_USERNS
              Returns a file descriptor that refers to  the  owning  user
              namespace for the namespace referred to by fd.

       NS_GET_PARENT
              Returns  a file descriptor that refers to the parent names‐
              pace of the namespace referred to by fd.  This operation is
              valid  only for hierarchical namespaces (i.e., PID and user
              namespaces).  For user namespaces, NS_GET_PARENT is synony‐
              mous with NS_GET_USERNS.

       In each case, the returned file descriptor is opened with O_RDONLY
       and O_CLOEXEC (close-on-exec).

       By applying fstat(2) to the returned file descriptor, one  obtains
       a  stat structure whose st_ino (inode number) field identifies the
       owning/parent namespace.  This inode number can  be  matched  with
       the  inode  number  of  another  /proc/[pid]/ns/{pid,user} file to
       determine whether that is the owning/parent namespace.
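
The comparison described above might be sketched as follows. The helper name ns_fd_matches_path() is hypothetical. Comparing st_dev in addition to st_ino is an assumption made here to be conservative; the text above mentions only the inode number.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Compare the object referred to by ns_fd (for example, a descriptor
   returned by NS_GET_USERNS or NS_GET_PARENT) with the file at path
   (for example, another /proc/[pid]/ns/user file).

   Returns 1 if both refer to the same device and inode, 0 if they
   differ, or -1 if the path cannot be opened or fstat(2) fails. */
static int
ns_fd_matches_path(int ns_fd, const char *path)
{
    struct stat sa, sb;
    int other_fd, result;

    other_fd = open(path, O_RDONLY);
    if (other_fd == -1)
        return -1;

    if (fstat(ns_fd, &sa) == -1 || fstat(other_fd, &sb) == -1)
        result = -1;
    else
        result = (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino);

    close(other_fd);
    return result;
}
```

The helper works on any pair of files, so it can be tried out on ordinary paths before being used on namespace descriptors.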

       Either of these ioctl(2) operations can fail  with  the  following
       error:

       EPERM  The  requested  namespace is outside of the caller's names‐
              pace scope.  This error can occur if, for example, the own‐
              ing  user  namespace is an ancestor of the caller's current
              user namespace.  It can also occur on  attempts  to  obtain
              the parent of the initial user or PID namespace.

       Additionally,  the  NS_GET_PARENT operation can fail with the fol‐
       lowing error:

       EINVAL fd refers to a nonhierarchical namespace.
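
To illustrate how these errors show up in practice, the sketch below repeatedly applies NS_GET_PARENT to walk up a PID or user namespace hierarchy until the operation fails. Under the semantics described above, the walk would be expected to end with EPERM, either at the edge of the caller's namespace scope or at the initial namespace. The function name count_visible_ancestors() is hypothetical, and the fallback definitions mirror the ones used in the example program later in this mail.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

#ifndef NS_GET_PARENT
#define NSIO            0xb7
#define NS_GET_PARENT   _IO(NSIO, 0x2)
#endif

/* Count how many ancestors of the namespace referred to by 'path'
   can be reached with NS_GET_PARENT.

   Returns the number of ancestors found if the walk ended with EPERM.
   Returns -1 with errno set if the file cannot be opened or if the
   walk ended with any other error (for example, EINVAL for a
   nonhierarchical namespace). */
static int
count_visible_ancestors(const char *path)
{
    int fd, parent_fd, saved_errno;
    int depth = 0;

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    while ((parent_fd = ioctl(fd, NS_GET_PARENT)) != -1) {
        close(fd);              /* Step up one level */
        fd = parent_fd;
        depth++;
    }

    saved_errno = errno;        /* close() may change errno */
    close(fd);
    errno = saved_errno;

    return (saved_errno == EPERM) ? depth : -1;
}
```

For example, count_visible_ancestors("/proc/self/ns/pid") would be expected to return 0 for a process in the initial PID namespace, although that depends on the kernel version and on the environment in which the caller runs.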

       See the EXAMPLE section for an example of the use of these  opera‐
       tions.

   [...]

EXAMPLE
       The  example  shown  below  uses the ioctl(2) operations described
       above to perform simple introspection of namespace  relationships.
       The  following  shell sessions show various examples of the use of
       this program.

       Trying to get the parent of the initial user namespace fails,  for
       the reasons explained earlier:

           $ ./ns_introspect /proc/self/ns/user p
           The parent namespace is outside your namespace scope

       Create a process running sleep(1) that resides in new user and UTS
       namespaces, and show that the new UTS namespace is associated with
       the new user namespace:

           $ unshare -Uu sleep 1000 &
           [1] 23235
           $ ./ns_introspect /proc/23235/ns/uts
           Inode number of owning user namespace is: 4026532448
           $ readlink /proc/23235/ns/user
           user:[4026532448]

       Then show that the parent of the new user namespace in the preced‐
       ing example is the initial user namespace:

           $ readlink /proc/self/ns/user
           user:[4026531837]
           $ ./ns_introspect /proc/23235/ns/user
           Inode number of owning user namespace is: 4026531837

       Start a shell in a new user namespace, and show that, from within
       this shell, the parent user namespace can't be discovered.
       Similarly, the owning user namespace of the UTS namespace (which
       is the initial user namespace) can't be discovered:

           $ PS1="sh2$ " unshare -U bash
           sh2$ ./ns_introspect /proc/self/ns/user p
           The parent namespace is outside your namespace scope
           sh2$ ./ns_introspect /proc/self/ns/uts u
           The owning user namespace is outside your namespace scope
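
The shell sessions above compare the inode numbers printed by the program with readlink(1) output of the form "user:[4026532448]". A program that wants to make the same comparison could parse that string along the following lines. This is only a sketch: parse_ns_link() and NS_TYPE_MAX are hypothetical names, and the "type:[inode]" format is assumed from the sessions shown above.

```c
#include <stdio.h>

#define NS_TYPE_MAX 32  /* Assumed upper bound on type name, with NUL */

/* Parse a namespace symlink target of the form "type:[inode]".
   On success, stores the type name in 'type' and the inode number
   in '*ino' and returns 0. Returns -1 if the string does not have
   exactly that form. */
static int
parse_ns_link(const char *link, char type[NS_TYPE_MAX], unsigned long *ino)
{
    int end = 0;

    /* %n is reached only if the closing ']' matched */
    if (sscanf(link, "%31[^:]:[%lu]%n", type, ino, &end) != 2)
        return -1;
    if (end == 0 || link[end] != '\0')
        return -1;
    return 0;
}
```

The link text itself can be read with readlink(2); the parsed inode number can then be compared with the st_ino value obtained from fstat(2).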

   Program source

       /* ns_introspect.c

          Licensed under GNU General Public License v2 or later
       */
       #include <stdlib.h>
       #include <unistd.h>
       #include <stdio.h>
       #include <sys/stat.h>
       #include <fcntl.h>
       #include <sys/ioctl.h>
       #include <string.h>
       #include <errno.h>

       #ifndef NS_GET_USERNS
       #define NSIO    0xb7
       #define NS_GET_USERNS   _IO(NSIO, 0x1)
       #define NS_GET_PARENT   _IO(NSIO, 0x2)
       #endif

       int
       main(int argc, char *argv[])
       {
           int fd, userns_fd, parent_fd;
           struct stat sb;

           if (argc < 2) {
               fprintf(stderr, "Usage: %s /proc/[pid]/ns/[file] [p|u]\n",
                       argv[0]);
               fprintf(stderr, "\nDisplay the result of one or both "
                       "of NS_GET_USERNS (u) and NS_GET_PARENT (p)\n"
                       "for the specified /proc/[pid]/ns/[file]. If neither "
                       "'p' nor 'u' is specified,\n"
                       "NS_GET_USERNS is the default.\n");
               exit(EXIT_FAILURE);
           }

           /* Obtain a file descriptor for the 'ns' file specified
              in argv[1] */

           fd = open(argv[1], O_RDONLY);
           if (fd == -1) {
               perror("open");
               exit(EXIT_FAILURE);
           }

           /* Obtain a file descriptor for the owning user namespace and
              then obtain and display the inode number of that namespace */

           if (argc < 3 || strchr(argv[2], 'u')) {
               userns_fd = ioctl(fd, NS_GET_USERNS);

               if (userns_fd == -1) {
                   if (errno == EPERM)
                       printf("The owning user namespace is outside "
                               "your namespace scope\n");
                   else
                       perror("ioctl-NS_GET_USERNS");
                   exit(EXIT_FAILURE);
               }

               if (fstat(userns_fd, &sb) == -1) {
                   perror("fstat-userns");
                   exit(EXIT_FAILURE);
               }
               printf("Inode number of owning user namespace is: %ld\n",
                       (long) sb.st_ino);

               close(userns_fd);
           }

           /* Obtain a file descriptor for the parent namespace and
              then obtain and display the inode number of that namespace */

           if (argc > 2 && strchr(argv[2], 'p')) {
               parent_fd = ioctl(fd, NS_GET_PARENT);

               if (parent_fd == -1) {
                   if (errno == EINVAL)
                       printf("Can't get parent namespace of a "
                               "nonhierarchical namespace\n");
                   else if (errno == EPERM)
                       printf("The parent namespace is outside "
                               "your namespace scope\n");
                   else
                       perror("ioctl-NS_GET_PARENT");
                   exit(EXIT_FAILURE);
               }

               if (fstat(parent_fd, &sb) == -1) {
                   perror("fstat-parentns");
                   exit(EXIT_FAILURE);
               }
               printf("Inode number of parent namespace is: %ld\n",
                       (long) sb.st_ino);

               close(parent_fd);
           }

           exit(EXIT_SUCCESS);
       }


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
