Message-ID: <20120517170015.GA12436@redhat.com>
Date: Thu, 17 May 2012 19:00:15 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
Pavel Emelyanov <xemul@...allels.com>,
Cyrill Gorcunov <gorcunov@...nvz.org>,
Louis Rilling <louis.rilling@...labs.com>,
Mike Galbraith <efault@....de>
Subject: Re: [PATCH 2/3] pidns: Guarantee that the pidns init will be the
last pidns process reaped.
On 05/16, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@...hat.com> writes:
>
> > Hmm. I don't think the patch is 100% correct. Afaics, this needs more
> > delay_pidns_leader() checks.
> >
> > For example. Suppose we have a CLONE_NEWPID zombie I, it has an
> > EXIT_DEAD child D so delay_pidns_leader(I) == T.
> >
> > Now suppose that I->real_parent exits, let's denote this task as P.
> >
> > Suppose that P->real_parent ignores SIGCHLD.
> >
> > In this case P will do release_task(I) prematurely. And worse, when
> > D finally does release_task(D) it will do release_task(I) again.
>
> Good point. I will fix that and post a patch shortly. It doesn't
> need a full delay_pidns_leader test just a test for children.

This will add more complications. And even this is not enough, I guess.
For example, __ptrace_detach()...

I agree, the idea to "hack" release_task() so that it switches to
init is clever, but imho it is too clever ;)

Seriously, what do you think about the patch below? Or something
like this. It is still based on your suggestion to check ->children,
but it is much, much simpler and easier to understand.

Just in case... Even with the PF_EXITING check, __wake_up_parent()
can be wrong, but this is very unlikely and harmless.

What do you think?

> In looking for any other weird corner case bugs I am noticing that
> I don't think I handled the case of a ptraced init quite right.
> I don't understand the change signaling semantics when the
> ptracer is our parent.

Do you mean the "if (tsk->ptrace)" code in exit_notify()? Nobody
understands it ;) Last time this code was modified by me (iirc), but
I simply tried to preserve the previous behaviour.

Oleg.

--- x/kernel/exit.c
+++ x/kernel/exit.c
@@ -63,6 +63,13 @@ static void exit_mm(struct task_struct *
 static void __unhash_process(struct task_struct *p, bool group_dead)
 {
+	struct task_struct *parent = p->parent;
+	bool parent_is_init = false;
+
+#ifdef CONFIG_PID_NS
+	parent_is_init = (task_active_pid_ns(p)->child_reaper == parent);
+#endif
+
 	nr_threads--;
 	detach_pid(p, PIDTYPE_PID);
 	if (group_dead) {
@@ -72,6 +79,11 @@ static void __unhash_process(struct task
 		list_del_rcu(&p->tasks);
 		list_del_init(&p->sibling);
 		__this_cpu_dec(process_counts);
+
+		if (parent_is_init && (parent->flags & PF_EXITING)) {
+			if (list_empty(&parent->children))
+				__wake_up_parent(p, parent);
+		}
 	}
 	list_del_rcu(&p->thread_group);
 }
--- x/kernel/pid_namespace.c
+++ x/kernel/pid_namespace.c
@@ -184,6 +184,9 @@ void zap_pid_ns_processes(struct pid_nam
 		rc = sys_wait4(-1, NULL, __WALL, NULL);
 	} while (rc != -ECHILD);
+	wait_event(&current->signal->wait_chldexit,
+			list_empty(&current->children));
+
 	if (pid_ns->reboot)
 		current->signal->group_exit_code = pid_ns->reboot;
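
For readers who want to see the intent of the two hunks in isolation, here is a rough userspace model, not kernel code. Every name in it (struct model_task, model_unhash() and so on) is hypothetical and made up for illustration. A child counter stands in for the ->children list, a boolean stands in for PF_EXITING, and a wakeup counter stands in for __wake_up_parent(). It only sketches the idea: the exiting init may proceed once its children list is empty, and the last child to be unhashed is the one that wakes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct; not a kernel type. */
struct model_task {
	struct model_task *parent;
	int nr_children;	/* models the length of ->children */
	bool exiting;		/* models PF_EXITING */
	bool is_reaper;		/* models "is the pidns child_reaper" */
	int wakeups;		/* counts modelled __wake_up_parent() calls */
};

static void model_add_child(struct model_task *parent,
			    struct model_task *child)
{
	child->parent = parent;
	parent->nr_children++;
}

/* Follows the shape of the check added to __unhash_process(). */
static void model_unhash(struct model_task *p)
{
	struct model_task *parent = p->parent;
	bool parent_is_init;

	if (!parent)
		return;
	parent_is_init = parent->is_reaper;

	/* models list_del_init(&p->sibling) */
	assert(parent->nr_children > 0);
	parent->nr_children--;
	p->parent = NULL;

	if (parent_is_init && parent->exiting) {
		if (parent->nr_children == 0)
			parent->wakeups++;	/* models the wakeup */
	}
}

/* Models the condition the exiting init sleeps on in wait_event(). */
static bool model_init_may_proceed(const struct model_task *init)
{
	return init->nr_children == 0;
}
```

With two children attached to an exiting reaper, the first model_unhash() leaves the condition false and sends no wakeup. The second one empties the list and sends exactly one. The model ignores threads, locking and the group_dead distinction, all of which matter in the real code.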