Message-ID: <CAPx_LQGBJGgZ+zzhJ2U4RpoPKt3hvf8LRfACtj2LPD7senub7A@mail.gmail.com>
Date: Mon, 22 Mar 2021 00:00:48 +0800
From: qianli zhao <zhaoqianligood@...il.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Oleg Nesterov <oleg@...hat.com>, christian@...uner.io,
axboe@...nel.dk, Thomas Gleixner <tglx@...utronix.de>,
Peter Collingbourne <pcc@...gle.com>,
linux-kernel@...r.kernel.org, Qianli Zhao <zhaoqianli@...omi.com>
Subject: Re: [PATCH V3] exit: trigger panic when global init has exited
Hi Eric, Oleg,
> It is a bit tricky to tell if the movement of the decrement of
> signal->live is safe
It is hard to say whether it is safe or not; below are just a few of my thoughts.
> That affects current_is_single threaded
> which is used by unshare, setns of the time namespace, and setting
> the selinux part of creds.
I think it is OK for current_is_single_threaded(). In check_unshare_flags()
and selinux_setprocattr(), changing the position of "signal->live--" won't
change the result. I found no other dependency between this patch and
these functions.
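To make the signal->live part of that argument concrete, here is a minimal
userspace sketch, not kernel code. It assumes the relevant test is
"signal->live == 1" and models only that counter. The model_* names are
hypothetical, and the real current_is_single_threaded() looks at more than
this counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one field of struct signal_struct we need. */
struct model_signal {
	atomic_int live;	/* number of threads that have not exited yet */
};

/* Assumed shape of the signal->live test: exactly one live thread. */
static bool model_live_says_single(struct model_signal *sig)
{
	return atomic_load(&sig->live) == 1;
}

/*
 * Models "group_dead = atomic_dec_and_test(&signal->live)":
 * drops one thread from the count and reports whether it was the last.
 */
static bool model_thread_exit(struct model_signal *sig)
{
	int old = atomic_fetch_sub(&sig->live, 1);

	assert(old > 0);	/* a thread can only exit once */
	return old == 1;
}
```

In this model the value of the counter after an exit is the same wherever
the decrement sits in the exit path. What moves is the moment at which
other threads can observe the new value.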
> The usage in kernel/cgroup/cgroup.c:css_task_iter_advance seems safe.
> Hmm, Maybe not. Today cgroup_thread_change_begin is held around
> setting PF_EXITING before signal->live is decremented. So there seem to
> be some subtle cgroup dependencies.
Moving the decrement should only affect the code that runs between the old
and the new position of the signal->live decrement. In this range, I think
acct_update_integrals() and sync_mm_rss() mainly update some data, so only
exit_signals() and sched_exit() need attention.
cgroup_threadgroup_change_begin() is called in exit_signals(), and
css_task_iter_advance() uses signal->live, so the two may be somewhat
related.
cgroup_threadgroup_change_begin() just gives its users a stable thread
group, and css_task_iter_advance() only checks whether the group is dead,
so swapping the order of the signal->live decrement and the setting of
PF_EXITING seems safe.
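The window that the reordering opens can be shown with a small userspace
model. This is only a sketch under stated assumptions, not the kernel
implementation: the model_* names are hypothetical, the flag value is just
a model constant, and the "observer" stands for any code that looks at
both signal->live and PF_EXITING.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_PF_EXITING 0x4u	/* model constant standing in for PF_EXITING */

struct model_signal {
	atomic_int live;		/* threads in the group that are still alive */
};

struct model_task {
	unsigned int flags;
	struct model_signal *signal;
};

/* Models the effect of exit_signals() that matters here: set PF_EXITING. */
static void model_set_exiting(struct model_task *t)
{
	t->flags |= MODEL_PF_EXITING;
}

/* Models group_dead = atomic_dec_and_test(&tsk->signal->live). */
static bool model_dec_live(struct model_task *t)
{
	int old = atomic_fetch_sub(&t->signal->live, 1);

	assert(old > 0);	/* the count must not go negative */
	return old == 1;
}

/* Hypothetical observer: group already dead, thread not yet PF_EXITING. */
static bool model_dead_but_not_exiting(const struct model_task *t)
{
	return atomic_load(&t->signal->live) == 0 &&
	       !(t->flags & MODEL_PF_EXITING);
}
```

With the old order (set PF_EXITING, then decrement) the observer never
sees a dead group that contains a non-PF_EXITING thread. With the new
order (decrement, then set PF_EXITING) that state exists between the two
steps, so every user of signal->live has to tolerate it.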
> I think if we are going to move the decrement of signal->live that
> should be it's own patch and be accompanied with a good description of
> why it is safe instead of having the decrement of signal->live be there
> as a side effect of another change.
I'm not sure how to describe whether this move is safe. From my analysis,
I have not found any side effects.
Could you tell me how to prove it, or give me some suggestions?
Thanks
Eric W. Biederman <ebiederm@...ssion.com> wrote on Fri, Mar 19, 2021 at 3:09 AM:
>
> Oleg Nesterov <oleg@...hat.com> writes:
>
> > On 03/18, qianli zhao wrote:
> >>
> >> Hi,Oleg
> >>
> >> Thank you for your reply.
> >>
> >> >> When init sub-threads running on different CPUs exit at the same time,
> >> >> zap_pid_ns_processes()->BUG() may happen.
> >>
> >> > and why do you think your patch can't prevent this?
> >>
> >> > Sorry, I must have missed something. But it seems to me that you are trying
> >> > to fix the wrong problem. Yes, zap_pid_ns_processes() must not be called in
> >> > the root namespace, and this has nothing to do with CONFIG_PID_NS.
> >>
> >> Yes, i try to fix this exception by test SIGNAL_GROUP_EXIT and call
> >> panic before setting PF_EXITING to prevent zap_pid_ns_processes()
> >> being called when init do_exit().
> >
> > Ah, I didn't notice your patch does atomic_dec_and_test(signal->live)
> > before exit_signals() which sets PF_EXITING. Thanks for correcting me.
> >
> > So yes, I was wrong, your patch can prevent this. Although I'd like to
> > recheck if every do-something-if-group-dead action is correct in the
> > case we have a non-PF_EXITING thread...
> >
> > But then I don't understand the SIGNAL_GROUP_EXIT check added by your
> > patch. Do we really need it if we want to avoid zap_pid_ns_processes()
> > when the global init exits?
> >
> >> In addition, the patch also protects the init process state to
> >> successfully get usable init coredump.
> >
> > Could you spell please?
> >
> > Does this connect to SIGNAL_GROUP_EXIT check? Do you mean that you want
> > to panic earlier, before other init's sub-threads exit?
>
> That is my understanding.
>
> As I understand it this patch has two purposes:
> 1. Avoid the BUG_ON in zap_pid_ns_processes when !CONFIG_PID_NS
> 2. panic as early as possible so exiting threads don't removing
> interesting debugging state.
>
>
> It is a bit tricky to tell if the movement of the decrement of
> signal->live is safe. That affects current_is_single threaded
> which is used by unshare, setns of the time namespace, and setting
> the selinux part of creds.
>
> The usage in kernel/cgroup/cgroup.c:css_task_iter_advance seems safe.
> Hmm, Maybe not. Today cgroup_thread_change_begin is held around
> setting PF_EXITING before signal->live is decremented. So there seem to
> be some subtle cgroup dependencies.
>
> The usages of group_dead in do_exit seem safe, as except for the new
> one everything is the same.
>
> We could definitely take advantage of knowing group_dead in exit_signals
> to simplify it's optimization to not rerouting signals to living
> threads.
>
>
> I think if we are going to move the decrement of signal->live that
> should be it's own patch and be accompanied with a good description of
> why it is safe instead of having the decrement of signal->live be there
> as a side effect of another change.
>
> Eric