Message-ID: <69f13674-7f84-5dc7-0bd7-e5e65e9cb3b0@oracle.com>
Date:   Tue, 13 Mar 2018 14:20:16 -0700
From:   Nagarathnam Muthusamy <nagarathnam.muthusamy@...cle.com>
To:     Jann Horn <jannh@...gle.com>
Cc:     kernel list <linux-kernel@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>,
        Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
        Nagarajan.Muthukrishnan@...cle.com,
        Prakash Sangappa <prakash.sangappa@...cle.com>,
        Andy Lutomirski <luto@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Oleg Nesterov <oleg@...hat.com>,
        Serge Hallyn <serge.hallyn@...ntu.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Eugene Syromiatnikov <esyr@...hat.com>, xemul@...allels.com
Subject: Re: [RESEND RFC] translate_pid API



On 03/13/2018 01:47 PM, Jann Horn wrote:
> On Mon, Mar 12, 2018 at 10:18 AM,  <nagarathnam.muthusamy@...cle.com> wrote:
>> Resending the RFC with participants of previous discussions
>> in the list.
>>
>> Following patch which is a variation of a solution discussed
>> in https://lwn.net/Articles/736330/ provides the users of
>> pid namespace, the functionality of pid translation between
>> namespaces using a namespace identifier. The topic of
>> pid translation has been discussed in the community few times
>> but there has always been a resistance to adding new solution
>> for this problem.
>> I will outline the planned usecase of pid namespace by oracle
>> database and explain why any of the existing solution cannot
>> be used to solve their problem.
>>
>> Consider a system in which several PID namespaces with multiple
>> nested levels exists in parallel with monitor processes managing
>> all the namespaces. PID translation is required for controlling
>> and accessing information about the processes by the monitors
>> and other processes down the hierarchy of namespaces. Controlling
>> primarily involves sending signals or using ptrace by a process in
>> parent namespace on any of the processes in its child namespace.
>> Accessing information deals with the reading /proc/<pid>/* files
>> of processes in child namespace. None of the processes have
>> root/CAP_SYS_ADMIN privileges.
> How are you dealing with PID reuse?

We have a monitor process which keeps track of the liveness of
important processes. When a process dies, the monitor makes a note
of it and can therefore detect when its PID is reused.
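
To illustrate one way a userspace monitor could notice PID reuse, here is
a minimal sketch. It is not the monitor described above: it assumes the
monitor records the start time of each tracked process (field 22 of
/proc/<pid>/stat) and later compares it. The function names
parse_starttime, read_starttime and pid_reused are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Extract field 22 (starttime, in clock ticks since boot) from one
 * /proc/<pid>/stat line. The comm field (field 2) may contain spaces
 * and parentheses, so counting starts after the last ')'.
 * Returns 0 on success, -1 if the line is malformed or too short.
 */
int parse_starttime(const char *stat_line, unsigned long long *out)
{
	const char *p = strrchr(stat_line, ')');
	int field = 2;	/* the last ')' closes field 2 */

	if (!p)
		return -1;
	p++;
	while (*p) {
		while (*p == ' ')
			p++;
		if (!*p)
			break;
		field++;
		if (field == 22) {
			char *end;
			unsigned long long v = strtoull(p, &end, 10);

			if (end == p)
				return -1;
			*out = v;
			return 0;
		}
		while (*p && *p != ' ')
			p++;
	}
	return -1;
}

/* Read the start time of @pid; returns -1 if the process is gone. */
int read_starttime(pid_t pid, unsigned long long *out)
{
	char path[64], buf[1024];
	FILE *f;
	int ret = -1;

	snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_starttime(buf, out);
	fclose(f);
	return ret;
}

/*
 * Returns 1 if @pid no longer refers to the process whose start time
 * was @recorded (it exited, or the PID was reused), 0 otherwise.
 */
int pid_reused(pid_t pid, unsigned long long recorded)
{
	unsigned long long now;

	if (read_starttime(pid, &now) != 0)
		return 1;
	return now != recorded;
}
```

A monitor built this way would call read_starttime() once when it starts
tracking a process and pid_reused() before acting on the PID. There is
still a window between the check and the action, so this narrows the race
rather than closing it.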

>
> [...]
>> diff --git a/fs/nsfs.c b/fs/nsfs.c
>> index 36b0772..c635465 100644
>> --- a/fs/nsfs.c
>> +++ b/fs/nsfs.c
>> @@ -222,8 +222,13 @@ int ns_get_name(char *buf, size_t size, struct task_struct *task,
>>          const char *name;
>>          ns = ns_ops->get(task);
>>          if (ns) {
>> -               name = ns_ops->real_ns_name ? : ns_ops->name;
>> -               res = snprintf(buf, size, "%s:[%u]", name, ns->inum);
>> +               if (!strcmp(ns_ops->name, "pidns_id")) {
> Wouldn't it be cleaner to check for "ns_ops==&pidns_id_operations"?

Yup. Will fix it.

>
>> +                       res = snprintf(buf, size, "[%llu]",
>> +                                      (unsigned long long)ns->ns_id);
>> +               } else {
>> +                       name = ns_ops->real_ns_name ? : ns_ops->name;
>> +                       res = snprintf(buf, size, "%s:[%u]", name, ns->inum);
>> +               }
>>                  ns_ops->put(ns);
>>          }
>>          return res;
> [...]
>> diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
>> index 49538b1..11d1d57 100644
>> --- a/include/linux/pid_namespace.h
>> +++ b/include/linux/pid_namespace.h
>> @@ -11,6 +11,7 @@
>>   #include <linux/kref.h>
>>   #include <linux/ns_common.h>
>>   #include <linux/idr.h>
>> +#include <linux/list_bl.h>
>>
>>
>>   struct fs_pin;
>> @@ -44,6 +45,8 @@ struct pid_namespace {
>>          kgid_t pid_gid;
>>          int hide_pid;
>>          int reboot;     /* group exit code if this pidns was rebooted */
>> +       struct hlist_bl_node node;
>> +       atomic_t lookups_pending;
>>          struct ns_common ns;
>>   } __randomize_layout;
>>
> [...]
>> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
>> index 0b53eef..ff83aa8 100644
>> --- a/kernel/pid_namespace.c
>> +++ b/kernel/pid_namespace.c
> [...]
>> @@ -159,6 +201,30 @@ static void delayed_free_pidns(struct rcu_head *p)
>>
>>   static void destroy_pid_namespace(struct pid_namespace *ns)
>>   {
>> +       struct pid_namespace *ph;
>> +       struct hlist_bl_head *head;
>> +       struct hlist_bl_node *dup_node;
>> +
>> +       /*
>> +        * Remove the namespace structure from hash table so
>> +        * now new lookups can start on it.
> s/now new/no new/

Will fix it.

>
> [...]
>> @@ -474,9 +551,116 @@ static struct user_namespace *pidns_owner(struct ns_common *ns)
>>          .get_parent     = pidns_get_parent,
>>   };
>>
>> +/*
>> + * translate_pid - convert pid in source pid-ns into target pid-ns.
>> + * @pid: pid for translation
>> + * @source: pid-ns id
>> + * @target: pid-ns id
>> + *
>> + * Return pid in @target pid-ns, zero if task have no pid there,
>> + * or -ESRCH of task with @pid is not found in @source pid-ns.
> s/of/if/

Will fix it.

>
>> + */
>> +SYSCALL_DEFINE3(translate_pid, pid_t, pid, u64, source,
>> +               u64, target)
>> +{
>> +       struct pid_namespace *source_ns = NULL, *target_ns = NULL;
>> +       struct pid *struct_pid;
>> +       struct pid_namespace *ph;
>> +       struct hlist_bl_head *shead = NULL;
>> +       struct hlist_bl_head *thead = NULL;
>> +       struct hlist_bl_node *dup_node;
>> +       pid_t result;
>> +
>> +       if (!source) {
>> +               source_ns = &init_pid_ns;
>> +       } else {
>> +               shead = pid_ns_hash_head(pid_ns_hash, source);
>> +               hlist_bl_lock(shead);
>> +               hlist_bl_for_each_entry(ph, dup_node, shead, node) {
>> +                       if (source == ph->ns.ns_id) {
>> +                               source_ns = ph;
>> +                               break;
>> +                       }
>> +               }
>> +               if (!source_ns) {
>> +                       hlist_bl_unlock(shead);
>> +                       return -EINVAL;
>> +               }
>> +       }
>> +       if (!ptrace_may_access(source_ns->child_reaper,
>> +                              PTRACE_MODE_READ_FSCREDS)) {
> AFAICS this proposal breaks the visibility restrictions that
> namespaces normally create. If there are two namespaces-based
> containers that use the same UID range, I don't think they should be
> able to learn information about each other, such as which PIDs are in
> use in the other container; but as far as I can tell, your proposal
> makes it possible to do that (unless an LSM or so is interfering). I
> would prefer it if this API required visibility of the targeted PID
> namespaces in the caller's PID namespace.

I am trying to mirror the access restrictions that apply to a
process's /proc/<pid>/ns/pid file. If the translator has access to
the /proc/<pid>/ns/pid files of both the source and destination
namespaces, shouldn't it be allowed to translate the pid between
them?
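
For reference, the existing check I am comparing against can be exercised
from userspace as in the sketch below. It only uses readlink() on
/proc/<pid>/ns/pid, which is subject to a PTRACE_MODE_READ_FSCREDS check
and returns a string of the form "pid:[<inum>]". The helper names
parse_ns_link and read_pidns_inum are hypothetical, and the sketch does
not use the proposed translate_pid syscall.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>

/*
 * Parse a namespace link target such as "pid:[4026531836]".
 * @name is the expected namespace type ("pid" here).
 * Returns 0 and stores the inode number on success, -1 otherwise.
 */
int parse_ns_link(const char *link, const char *name, unsigned long *inum)
{
	size_t n = strlen(name);
	char close_bracket;

	if (strncmp(link, name, n) != 0)
		return -1;
	if (sscanf(link + n, ":[%lu%c", inum, &close_bracket) != 2 ||
	    close_bracket != ']')
		return -1;
	return 0;
}

/*
 * Read the PID namespace inode number of @pid. Fails with -1 if the
 * caller cannot dereference the link (for example EACCES when the
 * ptrace access check fails, or ENOENT when the process is gone).
 */
int read_pidns_inum(pid_t pid, unsigned long *inum)
{
	char path[64], link[64];
	ssize_t len;

	snprintf(path, sizeof(path), "/proc/%ld/ns/pid", (long)pid);
	len = readlink(path, link, sizeof(link) - 1);
	if (len < 0)
		return -1;
	link[len] = '\0';
	return parse_ns_link(link, "pid", inum);
}
```

A caller for which read_pidns_inum() succeeds on a process in the source
namespace and on one in the target namespace is the kind of translator I
have in mind in the question above.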

>
> When doing ptrace access checks, please use the real creds in syscalls
> like this one, not the fs creds. The fs creds are for filesystem
> syscalls (in particular sys_open()), not for specialized syscalls like
> ptrace() or this one.
Will fix this.

Thanks,
Nagarathnam.
