linux-kernel - Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191214062704.GA5580@cqw-OptiPlex-7050>
Date:   Sat, 14 Dec 2019 14:27:04 +0800
From:   chenqiwu <qiwuchen55@...il.com>
To:     Christian Brauner <christian.brauner@...ntu.com>,
        peterz@...radead.org, mingo@...nel.org
Cc:     kernel-team@...roid.com, linux-kernel@...r.kernel.org,
        chenqiwu@...omi.com
Subject: Re: [PATCH] kernel/exit: do panic earlier to get coredump if global
 init task exit

On Thu, Dec 12, 2019 at 12:05:14PM +0100, Christian Brauner wrote:
> On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote:
> > can't you use is_global_init() && group_dead ?
> 
> Seems reasonable.
> Looks like we can move
> group_dead = atomic_dec_and_test(&tsk->signal->live);
> further up...
> 
> (Ideally I'd like to have a test for this to ensure that this lets you
> capture a global init coredump but that might be tricky. But since you've
> seem to have run into this case maybe you even have something that could
> be turned into a test? (Similar to how we already have a purely opt-in
> test for pstore.))
> 
> Christian

Hi all,
I agree that using is_global_init() && group_dead is more reasonable.

The crash isuee happened on a Android phone by reboot stress test.
panic log:
[   84.048521] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[   84.048521]
[   84.048540] CPU: 2 PID: 1 Comm: init Tainted: G S         O    4.14.117-perf-g8035d1a #1
[   84.048544] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 RAPHAEL (DT)
[   84.048550] Call trace:
[   84.048564]  dump_backtrace+0x0/0x268
[   84.048569]  show_stack+0x14/0x20
[   84.048577]  dump_stack+0xc4/0x100
[   84.048584]  panic+0x1f0/0x410
[   84.048591]  complete_and_exit+0x0/0x20
[   84.048596]  do_group_exit+0x8c/0xa0
[   84.048602]  get_signal+0x1c0/0x790
[   84.048608]  do_notify_resume+0x184/0xc30
[   84.048613]  work_pending+0x8/0x10

>From the kdump loaded by crash utility, all threads of global init have exited,
the group_dead value of global init has truned to 0 by atomic_dec_and_test().
crash> ps init
   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
>     1      0   2  ffffffcd77526000  ??   0.0       0      0  init
    534      1   4  ffffffcd6b9a9000  ZO   0.0       0      0  init
    535      1   4  ffffffcd6b9aa000  ZO   0.0       0      0  init
crash> ps -g 1
PID: 1      TASK: ffffffcd77526000  CPU: 2   COMMAND: "init"
  (no threads)
crash> struct task_struct.signal ffffffcd77526000
  signal = 0xffffffcd77530000
crash> struct signal_struct 0xffffffcd77530000
struct signal_struct {
  sigcnt = {
    counter = 1
  },
  live = {
    counter = 0
  },
  nr_threads = 1,
  thread_head = {
    next = 0xffffffcd77526730,
    prev = 0xffffffcd77526730
  },
  group_exit_code = 11,
  notify_count = 0,
  group_exit_task = 0x0,
  group_stop_count = 0,
  flags = 4,
  ...
 }

However, as Christian said, the test for this is tricky since we must
make sure all of init threads exited. I make a test for is_global_init()
and send a SIGSEGV signal to global init task in userspace. The phone
crash imeddiately and reboot to collect kdump. Then I extract the coredump
of global init task from kdump successfully.
(gdb) core-file core.1.init
Core was generated by `/system/bin/init second_stage'.
#0  _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
9           cmn     x0, #(MAX_ERRNO + 1)
(gdb) bt
#0  _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
#1  0x00000055606db11c in android::init::InstallRebootSignalHandlers()::$_14::operator()(int) const (this=<optimized out>, signal=11)
    at system/core/init/reboot_utils.cpp:141
#2  0x00000055606db100 in android::init::InstallRebootSignalHandlers()::$_14::__invoke(int) (signal=11) at system/core/init/reboot_utils.cpp:138
#3  0x0000007f8de236a0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

So from the following test, we have confidence that the following patch can help us
to get the extra coredump of init when global init task do real exit.

diff --git a/kernel/exit.c b/kernel/exit.c
index 8e288e8..9454106 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -476,10 +476,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father,
        }

        write_unlock_irq(&tasklist_lock);
-       if (unlikely(pid_ns == &init_pid_ns)) {
-               panic("Attempted to kill init! exitcode=0x%08x\n",
-                       father->signal->group_exit_code ?: father->exit_code);
-       }

        list_for_each_entry_safe(p, n, dead, ptrace_entry) {
                list_del_init(&p->ptrace_entry);
@@ -687,6 +683,10 @@ void do_exit(long code)
        if (unlikely(!tsk->pid))
                panic("Attempted to kill the idle task!");

+       group_dead = atomic_dec_and_test(&tsk->signal->live);
+       if (unlikely(is_global_init(tsk) && group_dead))
+               panic("Attempted to kill init! exitcode=0x%08lx\n", code);
+
        /*
         * If do_exit is called because this processes oopsed, it's possible
         * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
@@ -743,7 +743,6 @@ void do_exit(long code)
        if (tsk->mm)
                sync_mm_rss(tsk->mm);
        acct_update_integrals(tsk);
-       group_dead = atomic_dec_and_test(&tsk->signal->live);
        if (group_dead) {
                hrtimer_cancel(&tsk->signal->real_timer);
                exit_itimers(tsk->signal);