linux-kernel - Re: [PATCH] exit: trigger panic when init process is set to SIGNAL_GROUP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <m1lfavufep.fsf@fess.ebiederm.org>
Date:   Wed, 10 Mar 2021 10:44:46 -0600
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     qianli zhao <zhaoqianligood@...il.com>
Cc:     Oleg Nesterov <oleg@...hat.com>, christian@...uner.io,
        axboe@...nel.dk, Thomas Gleixner <tglx@...utronix.de>,
        Peter Collingbourne <pcc@...gle.com>,
        linux-kernel@...r.kernel.org, Qianli Zhao <zhaoqianli@...omi.com>
Subject: Re: [PATCH] exit: trigger panic when init process is set to SIGNAL_GROUP_EXIT

qianli zhao <zhaoqianligood@...il.com> writes:

> Hi,Oleg
>
> Thanks for your replay.
>
>> To be honest, I don't understand the changelog. It seems that you want
>> to uglify the kernel to simplify the debugging of buggy init? Or what?
>
> My patch is for the following purpose:
> 1. I hope to fix the occurrence of unexpected panic
> - [   24.705390] kernel BUG at include/linux/pid_namespace.h:98!
> This problem occurs when both two init threads enter the do_exit,
> One of the init thread is syscall sys_exit_group,and set SIGNAL_GROUP_EXIT
> The other init thread perform ret_to_user()->get_signal() and found
> SIGNAL_GROUP_EXIT is set,then do_group_exit()->do_exit()
> Since there are no alive init threads it finally goes to
> zap_pid_ns_processes() and BUG().
> Timing details are in the changelog.

At the start of your changelog and your patch subject you describe what
you are doing but not why.  For the next revision of the patch please
lead with the why it makes what you are trying to do much easier to
understand.

> 2. I hope to fix the problem that coredump cannot be parsed after init
> crash Before my patch,ever init sub-thread will finish do_exit(),the last
> thread will trigger panic().
> Except for the thread that triggered the panic,the state(such as
> PF_EXITING etc) of the other threads is already exiting,so we can't
> parse coredump from fulldump
> In fact, when any one init has been set to SIGNAL_GROUP_EXIT, then we
> can trigger panic,and prevent other init threads from continuing to
> exit
>
>> Nor can I understand the patch. I fail to understand the games with
>> SIGNAL_UNKILLABLE and ->siglock.
>
> When first init thread panic,i don't want other init threads to still
> exit,this will destroy the state of other init threads.
> so i use SIGNAL_UNKILLABLE to mark this stat,prevent other init
> threads from continuing to exit
> In addition i use siglock to protect tsk->signal->flags.

It does not work to use SIGNAL_UNKILLABLE for this.  Normally init
has SIGNAL_UNKILLABLE set.  The only case that clears SIGNAL_UNKILLABLE
is force_sig_info_to_task.  If the init process exits with exit(2)
SIGNAL_UNKILLABLE will already be set.  Which means testing
SIGNAL_UNKILLABLE as your patch does will prevent the panic.

Further simply calling panic is sufficient to guarantee that the other
threads don't exit, and that whichever thread calls panic first
will be the reporting thread.  The rest of the threads will be caught
in panic_smp_self_stop(), if they happen to be running on other cpus.

So I would make the whole thing just be:

	/* If global init has exited,
         * panic immediately to get a useable coredump.
         */
	if (unlikely(is_global_init(tsk) &&
	    (thread_group_empty(tsk) ||
            (tsk->signal->flags & SIGNAL_GROUP_EXIT)))) {
		panic("Attempted to kill init!	exitcode=0x%08x\n",
                	tsk->signal->group_exit_code ?: (int)code);
	}

The thread_group_empty test is needed to handle single threaded
inits.

Do you think you can respin your patch as something like that?

Eric

>
>> And iiuc with this patch the kernel will crash if init's sub-thread execs,
>> signal_group_exit() returns T in this case.
>
> Oleg Nesterov <oleg@...hat.com> 于2021年3月10日周三 上午2:27写道：
>>
>> On 03/09, Qianli Zhao wrote:
>> >
>> > From: Qianli Zhao <zhaoqianli@...omi.com>
>> >
>> > Once any init thread finds SIGNAL_GROUP_EXIT, trigger panic immediately
>> > instead of last thread of global init has exited, and do not allow other
>> > init threads to exit, protect task/memory state of all sub-threads for
>> > get reliable init coredump
>>
>> To be honest, I don't understand the changelog. It seems that you want
>> to uglify the kernel to simplify the debugging of buggy init? Or what?
>>
>> Nor can I understand the patch. I fail to understand the games with
>> SIGNAL_UNKILLABLE and ->siglock.
>>
>> And iiuc with this patch the kernel will crash if init's sub-thread execs,
>> signal_group_exit() returns T in this case.
>>
>> Oleg.
>>
>> > [   24.705376] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
>> > [   24.705382] CPU: 4 PID: 552 Comm: init Tainted: G S         O    4.14.180-perf-g4483caa8ae80-dirty #1
>> > [   24.705390] kernel BUG at include/linux/pid_namespace.h:98!
>> >
>> > PID: 552   CPU: 4   COMMAND: "init"
>> > PID: 1     CPU: 7   COMMAND: "init"
>> > core4                         core7
>> > ...                           sys_exit_group()
>> >                               do_group_exit()
>> >                                   - sig->flags = SIGNAL_GROUP_EXIT
>> >                                   - zap_other_threads()
>> >                               do_exit()
>> >                                   - PF_EXITING is set
>> > ret_to_user()
>> > do_notify_resume()
>> > get_signal()
>> >     - signal_group_exit
>> >     - goto fatal;
>> > do_group_exit()
>> > do_exit()
>> >     - PF_EXITING is set
>> >     - panic("Attempted to kill init! exitcode=0x%08x\n")
>> >                               exit_notify()
>> >                               find_alive_thread() //no alive sub-threads
>> >                               zap_pid_ns_processes()//CONFIG_PID_NS is not set
>> >                               BUG()
>> >
>> > Signed-off-by: Qianli Zhao <zhaoqianli@...omi.com>
>> > ---
>> > We got an init crash issue, but we can't get init coredump from fulldump, we also
>> > see BUG() triggered which calling in zap_pid_ns_processes().
>> >
>> > From crash dump we can get the following information:
>> > 1. "Attempted to kill init",init process is killed.
>> > - Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
>> > 2. At the same time as init crash, a BUG() triggered in other core.
>> > - [   24.705390] kernel BUG at include/linux/pid_namespace.h:98!
>> > 3. When init thread calls exit_mm, the corresponding thread task->mm will be empty, which is not conducive to extracting coredump
>> >
>> > To fix the issue and save complete coredump, once find init thread is set to SIGNAL_GROUP_EXIT
>> > trigger panic immediately,and other child threads are not allowed to exit just wait for reboot
>> >
>> > PID: 1      TASK: ffffffc973126900  CPU: 7   COMMAND: "init"
>> >  #0 [ffffff800805ba60] perf_trace_kernel_panic_late at ffffff99ac0bcbcc
>> >  #1 [ffffff800805bac0] die at ffffff99ac08dc64
>> >  #2 [ffffff800805bb10] bug_handler at ffffff99ac08e398
>> >  #3 [ffffff800805bbc0] brk_handler at ffffff99ac08529c
>> >  #4 [ffffff800805bc80] do_debug_exception at ffffff99ac0814e4
>> >  #5 [ffffff800805bdf0] el1_dbg at ffffff99ac083298
>> > ->Exception
>> >     /home/work/courbet-r-stable-build/kernel/msm-4.14/include/linux/pid_namespace.h: 98
>> >  #6 [ffffff800805be20] do_exit at ffffff99ac0c22e8
>> >  #7 [ffffff800805be80] do_group_exit at ffffff99ac0c2658
>> >  #8 [ffffff800805beb0] sys_exit_group at ffffff99ac0c266c
>> >  #9 [ffffff800805bff0] el0_svc_naked at ffffff99ac083cf
>> > ->SYSCALLNO: 5e (__NR_exit_group)
>> >
>> > PID: 552    TASK: ffffffc9613c8f00  CPU: 4   COMMAND: "init"
>> >  #0 [ffffff801455b870] __delay at ffffff99ad32cc14
>> >  #1 [ffffff801455b8b0] __const_udelay at ffffff99ad32cd10
>> >  #2 [ffffff801455b8c0] msm_trigger_wdog_bite at ffffff99ac5d5be0
>> >  #3 [ffffff801455b980] do_msm_restart at ffffff99acccc3f8
>> >  #4 [ffffff801455b9b0] machine_restart at ffffff99ac085dd0
>> >  #5 [ffffff801455b9d0] emergency_restart at ffffff99ac0eb6dc
>> >  #6 [ffffff801455baf0] panic at ffffff99ac0bd008
>> >  #7 [ffffff801455bb70] do_exit at ffffff99ac0c257c
>> >     /home/work/courbet-r-stable-build/kernel/msm-4.14/kernel/exit.c: 842
>> >  #8 [ffffff801455bbd0] do_group_exit at ffffff99ac0c2644
>> >  #9 [ffffff801455bcc0] get_signal at ffffff99ac0d1384
>> > #10 [ffffff801455be60] do_notify_resume at ffffff99ac08b2a8
>> > #11 [ffffff801455bff0] work_pending at ffffff99ac083b8c
>> >
>> > ---
>> >  kernel/exit.c | 29 +++++++++++++++++++++--------
>> >  1 file changed, 21 insertions(+), 8 deletions(-)
>> >
>> > diff --git a/kernel/exit.c b/kernel/exit.c
>> > index ef2fb929..6b2da22 100644
>> > --- a/kernel/exit.c
>> > +++ b/kernel/exit.c
>> > @@ -758,6 +758,27 @@ void __noreturn do_exit(long code)
>> >       validate_creds_for_do_exit(tsk);
>> >
>> >       /*
>> > +      * Once init group is marked for death,
>> > +      * panic immediately to get a useable coredump
>> > +      */
>> > +     if (unlikely(is_global_init(tsk) &&
>> > +         signal_group_exit(tsk->signal))) {
>> > +             spin_lock_irq(&tsk->sighand->siglock);
>> > +             if (!(tsk->signal->flags & SIGNAL_UNKILLABLE)) {
>> > +                     tsk->signal->flags |= SIGNAL_UNKILLABLE;
>> > +                     spin_unlock_irq(&tsk->sighand->siglock);
>> > +                     panic("Attempted to kill init! exitcode=0x%08x\n",
>> > +                             tsk->signal->group_exit_code ?: (int)code);
>> > +             } else {
>> > +                     /* init sub-thread is dying, just wait for reboot */
>> > +                     spin_unlock_irq(&tsk->sighand->siglock);
>> > +                     futex_exit_recursive(tsk);
>> > +                     set_current_state(TASK_UNINTERRUPTIBLE);
>> > +                     schedule();
>> > +             }
>> > +     }
>> > +
>> > +     /*
>> >        * We're taking recursive faults here in do_exit. Safest is to just
>> >        * leave this task alone and wait for reboot.
>> >        */
>> > @@ -776,14 +797,6 @@ void __noreturn do_exit(long code)
>> >       acct_update_integrals(tsk);
>> >       group_dead = atomic_dec_and_test(&tsk->signal->live);
>> >       if (group_dead) {
>> > -             /*
>> > -              * If the last thread of global init has exited, panic
>> > -              * immediately to get a useable coredump.
>> > -              */
>> > -             if (unlikely(is_global_init(tsk)))
>> > -                     panic("Attempted to kill init! exitcode=0x%08x\n",
>> > -                             tsk->signal->group_exit_code ?: (int)code);
>> > -
>> >  #ifdef CONFIG_POSIX_TIMERS
>> >               hrtimer_cancel(&tsk->signal->real_timer);
>> >               exit_itimers(tsk->signal);
>> > --
>> > 1.9.1
>> >
>>