Message-ID: <m1pmzwb7pd.fsf@fess.ebiederm.org>
Date: Thu, 18 Mar 2021 14:08:46 -0500
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@...hat.com>
Cc: qianli zhao <zhaoqianligood@...il.com>, christian@...uner.io,
axboe@...nel.dk, Thomas Gleixner <tglx@...utronix.de>,
Peter Collingbourne <pcc@...gle.com>,
linux-kernel@...r.kernel.org, Qianli Zhao <zhaoqianli@...omi.com>
Subject: Re: [PATCH V3] exit: trigger panic when global init has exited

Oleg Nesterov <oleg@...hat.com> writes:
> On 03/18, qianli zhao wrote:
>>
>> Hi,Oleg
>>
>> Thank you for your reply.
>>
>> >> When init sub-threads running on different CPUs exit at the same time,
>> >> zap_pid_ns_processes()->BUG() may happen.
>>
>> > and why do you think your patch can't prevent this?
>>
>> > Sorry, I must have missed something. But it seems to me that you are trying
>> > to fix the wrong problem. Yes, zap_pid_ns_processes() must not be called in
>> > the root namespace, and this has nothing to do with CONFIG_PID_NS.
>>
>> Yes, I try to fix this exception by testing SIGNAL_GROUP_EXIT and calling
>> panic before PF_EXITING is set, to prevent zap_pid_ns_processes() from
>> being called when init runs do_exit().
>
> Ah, I didn't notice your patch does atomic_dec_and_test(signal->live)
> before exit_signals() which sets PF_EXITING. Thanks for correcting me.
>
> So yes, I was wrong, your patch can prevent this. Although I'd like to
> recheck if every do-something-if-group-dead action is correct in the
> case we have a non-PF_EXITING thread...
>
> But then I don't understand the SIGNAL_GROUP_EXIT check added by your
> patch. Do we really need it if we want to avoid zap_pid_ns_processes()
> when the global init exits?
>
>> In addition, the patch also protects the init process state so that
>> a usable init coredump can be obtained.
>
> Could you spell that out, please?
>
> Does this connect to SIGNAL_GROUP_EXIT check? Do you mean that you want
> to panic earlier, before other init's sub-threads exit?

That is my understanding.

As I understand it, this patch has two purposes:
1. Avoid the BUG_ON in zap_pid_ns_processes when !CONFIG_PID_NS
2. Panic as early as possible so exiting threads don't remove
   interesting debugging state.
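
A minimal userspace sketch of the ordering behind those two purposes may
help. It is not kernel code: every name below is a hypothetical stand-in,
and the panic condition (group dead, or a group exit already in progress)
is an assumption, since the patch itself is not quoted in this message.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum exit_action { EXIT_CONTINUE, EXIT_PANIC };

struct model_signal {
	int live;          /* stand-in for signal->live */
	bool group_exit;   /* stand-in for the SIGNAL_GROUP_EXIT flag */
};

struct model_task {
	struct model_signal *signal;
	bool is_global_init;   /* stand-in for is_global_init(tsk) */
	bool exiting;          /* stand-in for PF_EXITING */
};

/*
 * Models the ordering under discussion: drop the live count first,
 * decide whether to panic, and only then mark the thread as exiting.
 * The condition itself is an assumption made for illustration.
 */
static enum exit_action model_do_exit(struct model_task *tsk)
{
	bool group_dead = (--tsk->signal->live == 0);

	if (tsk->is_global_init &&
	    (group_dead || tsk->signal->group_exit))
		return EXIT_PANIC;   /* tsk->exiting is still false here */

	tsk->exiting = true;         /* stand-in for exit_signals() */
	return EXIT_CONTINUE;
}
```

In this model the last init thread reports EXIT_PANIC with exiting still
false, so the later exit path that would reach zap_pid_ns_processes() is
never entered and the thread's state is left untouched for debugging.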

It is a bit tricky to tell if the movement of the decrement of
signal->live is safe. That affects current_is_single_threaded,
which is used by unshare, setns of the time namespace, and setting
the selinux part of creds.
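
To make the ordering concern concrete, here is a toy userspace model. It
assumes the single-threaded check reduces to "live == 1"; the real
current_is_single_threaded also looks at users of the mm, which this
ignores. All names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of a thread group; not a kernel structure. */
struct model_group {
	int live;      /* stand-in for signal->live */
	int exiting;   /* number of threads with PF_EXITING set */
};

/* Simplified stand-in: single threaded when exactly one thread is live. */
static bool model_is_single_threaded(const struct model_group *g)
{
	return g->live == 1;
}

/* First exit step in the proposed ordering: decrement signal->live. */
static void model_dec_live(struct model_group *g)
{
	g->live--;
}

/* Second exit step in the proposed ordering: set PF_EXITING. */
static void model_set_exiting(struct model_group *g)
{
	g->exiting++;
}
```

With two threads, calling model_dec_live() before model_set_exiting()
opens a window in which the survivor already looks single threaded while
its sibling is not yet marked exiting. The model only shows that the
window exists; whether any caller is hurt by it is the open question.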

The usage in kernel/cgroup/cgroup.c:css_task_iter_advance seems safe.
Hmm, maybe not. Today cgroup_threadgroup_change_begin is held around
setting PF_EXITING before signal->live is decremented. So there seem to
be some subtle cgroup dependencies.

The usages of group_dead in do_exit seem safe, as except for the new
one everything is the same.

We could definitely take advantage of knowing group_dead in exit_signals
to simplify its optimization of not rerouting signals to living
threads.

I think if we are going to move the decrement of signal->live, that
should be its own patch and be accompanied by a good description of
why it is safe, instead of having the decrement of signal->live be there
as a side effect of another change.

Eric