lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Fri, 12 May 2017 16:07:28 +0300 From: Kirill Tkhai <ktkhai@...tuozzo.com> To: <mhocko@...e.com>, <avagin@...nvz.org>, <peterz@...radead.org>, <oleg@...hat.com>, <linux-kernel@...r.kernel.org>, <rppt@...ux.vnet.ibm.com>, <ebiederm@...ssion.com>, <luto@...nel.org>, <gorcunov@...nvz.org>, <akpm@...ux-foundation.org>, <mingo@...nel.org>, <ktkhai@...tuozzo.com>, <serge@...lyn.com> Subject: [PATCH] pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes() Imagine we have a pid namespace and a task from its parent's pid_ns, which made setns() to the pid namespace. The task is doing fork(), while the pid namespace's child reaper is dying. We have the race between them: Task from parent pid_ns Child reaper copy_process() .. alloc_pid() .. .. zap_pid_ns_processes() .. disable_pid_allocation() .. read_lock(&tasklist_lock) .. iterate over pids in pid_ns .. kill tasks linked to pids .. read_unlock(&tasklist_lock) write_lock_irq(&tasklist_lock); .. attach_pid(p, PIDTYPE_PID); .. .. .. So, just created task p won't receive SIGKILL signal, and the pid namespace will be in contradictory state. Only manual kill will help there, but does the userspace care about this? I suppose, the most users just inject a task into a pid namespace and wait a SIGCHLD from it. The patch fixes the problem. It moves disable_pid_allocation() into find_child_reaper() where tasklist_lock is held, and this allows to simply check for (pid_ns->nr_hashed & PIDNS_HASH_ADDING) in copy_process(). If allocation is disabled, we just return -ENOMEM like it's made for such cases in alloc_pid(). Signed-off-by: Kirill Tkhai <ktkhai@...tuozzo.com> CC: Andrew Morton <akpm@...ux-foundation.org> CC: Ingo Molnar <mingo@...nel.org> CC: Peter Zijlstra <peterz@...radead.org> CC: Oleg Nesterov <oleg@...hat.com> CC: Mike Rapoport <rppt@...ux.vnet.ibm.com> CC: Michal Hocko <mhocko@...e.com> CC: Andy Lutomirski <luto@...nel.org> CC: "Eric W. Biederman" <ebiederm@...ssion.com> CC: Andrei Vagin <avagin@...nvz.org> CC: Cyrill Gorcunov <gorcunov@...nvz.org> CC: Serge Hallyn <serge@...lyn.com> --- kernel/exit.c | 2 ++ kernel/fork.c | 15 ++++++++++----- kernel/pid_namespace.c | 3 --- 3 files changed, 12 insertions(+), 8 deletions(-) diff --git a/kernel/exit.c b/kernel/exit.c index 516acdb0e0ec..9310e69fbc5f 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -586,6 +586,8 @@ static struct task_struct *find_child_reaper(struct task_struct *father) return reaper; } + /* Don't allow any more processes into the pid namespace */ + disable_pid_allocation(pid_ns); write_unlock_irq(&tasklist_lock); if (unlikely(pid_ns == &init_pid_ns)) { panic("Attempted to kill init! exitcode=0x%08x\n", diff --git a/kernel/fork.c b/kernel/fork.c index bfd91b180778..dbafabf6c7b1 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1523,6 +1523,7 @@ static __latent_entropy struct task_struct *copy_process( unsigned long tls, int node) { + struct pid_namespace *pid_ns; int retval; struct task_struct *p; @@ -1735,8 +1736,9 @@ static __latent_entropy struct task_struct *copy_process( if (retval) goto bad_fork_cleanup_io; + pid_ns = p->nsproxy->pid_ns_for_children; if (pid != &init_struct_pid) { - pid = alloc_pid(p->nsproxy->pid_ns_for_children); + pid = alloc_pid(pid_ns); if (IS_ERR(pid)) { retval = PTR_ERR(pid); goto bad_fork_cleanup_thread; @@ -1845,10 +1847,11 @@ static __latent_entropy struct task_struct *copy_process( */ recalc_sigpending(); if (signal_pending(current)) { - spin_unlock(¤t->sighand->siglock); - write_unlock_irq(&tasklist_lock); retval = -ERESTARTNOINTR; - goto bad_fork_cancel_cgroup; + goto bad_fork_unlock_siglock; + } else if (unlikely(!(pid_ns->nr_hashed & PIDNS_HASH_ADDING))) { + retval = -ENOMEM; + goto bad_fork_unlock_siglock; } if (likely(p->pid)) { @@ -1906,7 +1909,9 @@ static __latent_entropy struct task_struct *copy_process( return p; -bad_fork_cancel_cgroup: +bad_fork_unlock_siglock: + spin_unlock(¤t->sighand->siglock); + write_unlock_irq(&tasklist_lock); cgroup_cancel_fork(p); bad_fork_free_pid: cgroup_threadgroup_change_end(current); diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c index d1f3e9f558b8..aedf86a8017e 100644 --- a/kernel/pid_namespace.c +++ b/kernel/pid_namespace.c @@ -210,9 +210,6 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns) struct task_struct *task, *me = current; int init_pids = thread_group_leader(me) ? 1 : 2; - /* Don't allow any more processes into the pid namespace */ - disable_pid_allocation(pid_ns); - /* * Ignore SIGCHLD causing any terminated children to autoreap. * This speeds up the namespace shutdown, plus see the comment
Powered by blists - more mailing lists