Date:	Thu, 17 May 2012 15:46:53 -0600
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Pavel Emelyanov <xemul@...allels.com>,
	Cyrill Gorcunov <gorcunov@...nvz.org>,
	Louis Rilling <louis.rilling@...labs.com>,
	Mike Galbraith <efault@....de>
Subject: Re: [PATCH 2/3] pidns: Guarantee that the pidns init will be the last pidns process reaped.

Oleg Nesterov <oleg@...hat.com> writes:

> On 05/16, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@...hat.com> writes:
>>
>> > Hmm. I don't think the patch is 100% correct. Afaics, this needs more
>> > delay_pidns_leader() checks.
>> >
>> > For example. Suppose we have a CLONE_NEWPID zombie I, it has an
>> > EXIT_DEAD child D so delay_pidns_leader(I) == T.
>> >
>> > Now suppose that I->real_parent exits; let's denote this task as P.
>> >
>> > Suppose that P->real_parent ignores SIGCHLD.
>> >
>> > In this case P will do release_task(I) prematurely. And worse, when
>> > D finally does release_task(D) it will do release_task(I) again.
>>
>> Good point.  I will fix that and post a patch shortly.  It doesn't
>> need a full delay_pidns_leader test, just a test for children.
>
> This will add more complications. And even this is not enough, I guess.
> For example __ptrace_detach()...

Agreed.  I am having to step back and think about this a bit more.

I don't like doing things two different ways, but delay_group_leader()
and all of that is pretty horrible from a maintenance point of view,
and extending that just makes things worse.

> I agree, the idea to "hack" release_task() so that it switches to
> init is clever, but imho this is too clever ;)
>
> Seriously, what do you think about the patch below? Or something
> like this. It is still based on your suggestion to check ->children,
> but it is much, much more simple and understandable.
>
> Just in case... Even with the PF_EXITING check __wake_up_parent()
> can be wrong, but this is very unlikely and harmless.
>
> What do you think?

I think there is something very compelling about your solution.
We do still need my bit about making the init process ignore SIGCHLD
so that all of init's children self-reap.

Before I go further I am going to play with the code more.

In part I think the current code for waiting for processes to
die, etc., is pretty horrible maintenance-wise, and it might just
be worth cleaning up before we extend it with yet another
strange and bizarre case, if for no other reason than to make
it clear what we are doing.
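
For comparison, the wait loop that zap_pid_ns_processes() already runs
(call wait4 until it returns -ECHILD) looks roughly like this when seen
from userspace. This is only an illustrative sketch under my own
assumptions: spawn_and_reap_all() is a hypothetical name, and it passes
0 as the wait flags where the kernel code passes __WALL.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork up to n children that exit at once, then
 * wait until no children are left. Returns how many were reaped.
 */
int spawn_and_reap_all(int n)
{
	int i, reaped = 0;
	pid_t ret;

	for (i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;		/* could not fork; reap what we have */
		if (pid == 0)
			_exit(0);	/* child: exit immediately */
	}

	/* Same shape as the kernel loop: keep waiting until ECHILD. */
	for (;;) {
		ret = waitpid(-1, NULL, 0);
		if (ret > 0) {
			reaped++;
			continue;
		}
		if (ret < 0 && errno == EINTR)
			continue;
		break;			/* ECHILD: no children remain */
	}
	return reaped;
}
```

A loop like this only sees children that it can actually wait for,
which is why the patch below adds a separate wait on ->children.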


>> In looking for any other weird corner case bugs I am noticing that
>> I don't think I handled the case of a ptraced init quite right.
>> I don't understand the change in signaling semantics when the
>> ptracer is our parent.
>
> Do you mean the "if (tsk->ptrace)" code in exit_notify() ? Nobody
> understands it ;) Last time this code was modified by me (iirc), but
> I simply tried to preserve the previous behaviour.

Yes.  It is some pretty strange code.  Especially where we are reading
a return result which is always false.  I think there is a bug somewhere
between that code and ptrace detach but I don't know that I could tell
you what it is.

Hopefully I have a follow-on patch in another couple of hours.

Eric


> Oleg.
>
> --- x/kernel/exit.c
> +++ x/kernel/exit.c
> @@ -63,6 +63,13 @@ static void exit_mm(struct task_struct *
>  
>  static void __unhash_process(struct task_struct *p, bool group_dead)
>  {
> +	struct task_struct *parent = p->parent;
> +	bool parent_is_init = false;
> +
> +#ifdef CONFIG_PID_NS
> +	parent_is_init = (task_active_pid_ns(p)->child_reaper == parent);
> +#endif
> +
>  	nr_threads--;
>  	detach_pid(p, PIDTYPE_PID);
>  	if (group_dead) {
> @@ -72,6 +79,11 @@ static void __unhash_process(struct task
>  		list_del_rcu(&p->tasks);
>  		list_del_init(&p->sibling);
>  		__this_cpu_dec(process_counts);
> +
> +		if (parent_is_init && (parent->flags & PF_EXITING)) {
> +			if (list_empty(&parent->children))
> +				__wake_up_parent(p, parent);
> +		}
>  	}
>  	list_del_rcu(&p->thread_group); 
>  }
> --- x/kernel/pid_namespace.c
> +++ x/kernel/pid_namespace.c
> @@ -184,6 +184,9 @@ void zap_pid_ns_processes(struct pid_nam
>  		rc = sys_wait4(-1, NULL, __WALL, NULL);
>  	} while (rc != -ECHILD);
>  
> +	wait_event(current->signal->wait_chldexit,
> +			list_empty(&current->children));
> +
>  	if (pid_ns->reboot)
>  		current->signal->group_exit_code = pid_ns->reboot;
>  
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/