Date:   Wed, 25 Jan 2017 04:24:52 +1300
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Oleg Nesterov <oleg@...hat.com>
Cc:     Pavel Tikhomirov <ptikhomirov@...tuozzo.com>,
        Lennart Poettering <lennart@...ttering.net>,
        Kay Sievers <kay.sievers@...y.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Cyrill Gorcunov <gorcunov@...nvz.org>,
        John Stultz <john.stultz@...aro.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Nicolas Pitre <nicolas.pitre@...aro.org>,
        Michal Hocko <mhocko@...e.com>,
        Stanislav Kinsburskiy <skinsbursky@...tuozzo.com>,
        Mateusz Guzik <mguzik@...hat.com>,
        linux-kernel@...r.kernel.org,
        Pavel Emelyanov <xemul@...tuozzo.com>,
        Konstantin Khorenko <khorenko@...tuozzo.com>
Subject: Re: setns() && PR_SET_CHILD_SUBREAPER

Oleg Nesterov <oleg@...hat.com> writes:

> On 01/24, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@...hat.com> writes:
>>
>> > Suppose we have a process P in the root namespace and another namespace X.
>> >
>> > P does setns() and enters the X namespace.
>> > P forks a child C.
>> >
>> > C forks a grandchild G.
>> > C exits.
>> >
>> > The question is, where should we reparent the grandchild G? In the normal
>> > case it will be reparented to X->child_reaper and this looks correct.
>> >
>> > But let's suppose that P runs with the ->has_child_subreaper bit set. In
>> > this case it will be reparented to P's sub-reaper or a global init, and
>> > given that P can't control its ->has_child_subreaper flag this does not
>> > look right to me.
>> >
>> > I can make a simple patch but perhaps I missed something or we actually
>> > want this (imo strange) behaviour?
>>
>> We definitely do not want a child to be reparented out of a pid namespace
>> when the pid namespace has a perfectly fine child_reaper.
>>
>> The special case for the init_task in find_new_reaper appears to be the
>> instance of this problem that was considered in the code.
>
> Actually we should blame the same_thread_group(reaper, child_reaper) check:
> it should have ensured we could not cross namespaces, but it is not
> enough, because this logic predates setns().
>
>> Semantically what we want to do is walk up the parents in the process
>> tree.  If a parent has is_child_subreaper set, we stop at it.  If, in the
>> transition from one parent to the next, we are switching pid namespaces,
>> we want the reaper from the pid namespace.
>
> Yes, this is what I have in mind, see the patch below. I need to re-check
> it and update the comment to explain why we can't simply check child_reaper
> as we currently do.
>
> This way we can start the search from father->real_parent, but the comment
> above the "reaper == &init_task" is no longer correct, we always need this
> check although perhaps is_idle_task(reaper) would be better.
>
>> As I recall has_child_subreaper was just supposed to be an optimization
>> so the common case would not have to walk up the process tree when
>> finding its parent.
>
> Yep.
>
>> If we retain any optimizations such as has_child_subreaper please
>> consider the case where a process with is_child_subreaper set exits,
>> and what happens to its children.
>
> Yes, in this case it should not have any effect. Well, there is another
> corner case, perhaps we should turn
>
> 		if (!reaper->signal->is_child_subreaper)
> 			continue;
>
> into
> 		if (!reaper->signal->is_child_subreaper) {
> 			if (!reaper->signal->has_child_subreaper)
> 				break;
> 			continue;
> 		}
>
> this looks a bit more correct if the exited "is_child_subreaper" process
> was forked, and after that its parent called prctl(PR_SET_CHILD_SUBREAPER).
> But I think we do not care and Pavel is going to eliminate the case when
> a child of is_child_subreaper task can run without has_child_subreaper
> flag set.

That works as long as we update the flag when reparenting so that it stays
accurate, and clear it when creating a child in a new pid namespace.
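
A minimal user-space sketch of that flag handling, not kernel code. The
struct and the function names (subreaper_flags, hint_from_parent,
flags_on_fork, flags_on_reparent) are hypothetical, and reading "a child in
a new pid namespace" as "a child whose pid namespace differs from its
parent's" is an assumption:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two signal_struct bits discussed here. */
struct subreaper_flags {
	bool is_child_subreaper;   /* task asked to be a subreaper via prctl() */
	bool has_child_subreaper;  /* hint: some ancestor is a subreaper */
};

/* Value the hint should have for any task whose parent is `parent`. */
static bool hint_from_parent(const struct subreaper_flags *parent)
{
	return parent->is_child_subreaper || parent->has_child_subreaper;
}

/*
 * fork(): the subreaper setting itself is not inherited, only the hint.
 * Assumption: the hint is cleared when the child ends up in a different
 * pid namespace than its parent.
 */
static void flags_on_fork(struct subreaper_flags *child,
			  const struct subreaper_flags *parent,
			  bool crosses_pid_ns)
{
	child->is_child_subreaper = false;
	child->has_child_subreaper =
		crosses_pid_ns ? false : hint_from_parent(parent);
}

/* Reparenting: recompute the hint from the new parent so it stays accurate. */
static void flags_on_reparent(struct subreaper_flags *child,
			      const struct subreaper_flags *new_parent)
{
	child->has_child_subreaper = hint_from_parent(new_parent);
}
```

With both rules in place the hint would only ever be set when a subreaper
really is reachable by walking up within the same pid namespace.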

> So what do you think about the patch below?

That does look like the correct logic.
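
To make the difference concrete, here is a small user-space model of the
two walks, run on the P/C/G scenario from the start of the thread. It is
only a sketch: struct task, find_reaper_old, find_reaper_new and
demo_reaper_name are hypothetical names, a NULL real_parent stands in for
reaching init_task, and threads are ignored, so both walks start at
father->real_parent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for task_struct. */
struct task {
	const char *name;
	struct task *real_parent;  /* NULL models reaching init_task */
	unsigned int level;        /* models task_pid(t)->level */
	bool is_child_subreaper;
	bool has_child_subreaper;
};

/* Old-style walk: only stops when it meets the namespace's child_reaper. */
static struct task *find_reaper_old(struct task *father,
				    struct task *child_reaper)
{
	struct task *r;

	if (!father->has_child_subreaper)
		return child_reaper;
	for (r = father->real_parent; r && r != child_reaper; r = r->real_parent)
		if (r->is_child_subreaper)
			return r;
	return child_reaper;
}

/* Patched-style walk: stops as soon as the pid namespace level changes. */
static struct task *find_reaper_new(struct task *father,
				    struct task *child_reaper)
{
	struct task *r;

	if (!father->has_child_subreaper)
		return child_reaper;
	for (r = father->real_parent; r && r->level == father->level;
	     r = r->real_parent)
		if (r->is_child_subreaper)
			return r;
	return child_reaper;
}

/* P (level 0, below subreaper S) did setns() into X and forked C (level 1). */
static const char *demo_reaper_name(bool patched)
{
	struct task init  = { "init",   NULL,  0, false, false };
	struct task s     = { "S",      &init, 0, true,  false };
	struct task p     = { "P",      &s,    0, false, true  };
	struct task xinit = { "X-init", &init, 1, false, false };
	struct task c     = { "C",      &p,    1, false, true  };
	struct task *r = patched ? find_reaper_new(&c, &xinit)
				 : find_reaper_old(&c, &xinit);

	return r->name;
}
```

In this model demo_reaper_name(false) yields "S", so G escapes namespace X,
while demo_reaper_name(true) yields "X-init".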

Whose tree do we want to merge these fixes through?  I can pick them
up if that would be convenient.

Eric


> Oleg.
>
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -569,15 +569,15 @@ static struct task_struct *find_new_reaper(struct task_struct *father,
>  		return thread;
>  
>  	if (father->signal->has_child_subreaper) {
> +		unsigned int level = task_pid(father)->level;
>  		/*
>  		 * Find the first ->is_child_subreaper ancestor in our pid_ns.
> -		 * We start from father to ensure we can not look into another
> -		 * namespace, this is safe because all its threads are dead.
> +		 * We check pid->level, this is slightly more efficient than
> +		 * task_active_pid_ns(reaper) != task_active_pid_ns(father).
>  		 */
> -		for (reaper = father;
> -		     !same_thread_group(reaper, child_reaper);
> +		for (reaper = father->real_parent;
> +		     task_pid(reaper)->level == level;
>  		     reaper = reaper->real_parent) {
> -			/* call_usermodehelper() descendants need this check */
>  			if (reaper == &init_task)
>  				break;
>  			if (!reaper->signal->is_child_subreaper)
