linux-kernel - Re: WARNING in task_participate_group_stop

Message-ID: <20171106112508.lun6eftpj5icnvdy@cedar>
Date:   Mon, 6 Nov 2017 11:25:08 +0000
From:   Jamie Iles <jamie.iles@...cle.com>
To:     Dmitry Vyukov <dvyukov@...gle.com>
Cc:     Oleg Nesterov <oleg@...hat.com>,
        syzbot 
        <bot+c9f0eb0d2a5576ece331a767528e6b52b4ff1815@...kaller.appspotmail.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Arvind Yadav <arvind.yadav.cs@...il.com>,
        Mark Brown <broonie@...nel.org>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Frédéric Weisbecker <fweisbec@...il.com>,
        jamie.iles@...cle.com, LKML <linux-kernel@...r.kernel.org>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        mchehab@...nel.org, Ingo Molnar <mingo@...nel.org>,
        mpe@...erman.id.au, syzkaller-bugs@...glegroups.com,
        Al Viro <viro@...iv.linux.org.uk>, Kyle Huey <me@...ehuey.com>,
        Kees Cook <keescook@...omium.org>
Subject: Re: WARNING in task_participate_group_stop

Hi Dmitry,

On Mon, Nov 06, 2017 at 12:02:19PM +0100, Dmitry Vyukov wrote:
> On Thu, Nov 2, 2017 at 6:01 PM, Oleg Nesterov <oleg@...hat.com> wrote:
> > On 11/01, Dmitry Vyukov wrote:
> >>
> >> On Tue, Oct 31, 2017 at 7:34 PM, Oleg Nesterov <oleg@...hat.com> wrote:
> >> > Hmm. I do not see reproducer in this email...
> >>
> >> Ah, sorry. You can see full thread with attachments here:
> >> https://groups.google.com/forum/#!topic/syzkaller-bugs/EUmYZU4m5gU
> >
> > Heh. I can't say I enjoyed reading the reproducer ;)
> >
> >> >> > WARNING: CPU: 0 PID: 1 at kernel/signal.c:340
> >> >> > task_participate_group_stop+0x1ce/0x230 kernel/signal.c:340
> >> >> > Kernel panic - not syncing: panic_on_warn set ...
> >> >> >
> >> >> > CPU: 0 PID: 1 Comm: init Not tainted 4.13.0-mm1+ #5
> >> >
> >> > So this is init process with SIGNAL_UNKILLABLE flag set. And I hope it has
> >> > the pending SIGKILL, otherwise there is something else.
> >
> > From repro.c
> >
> >         line 111    r[8] = syscall(__NR_ptrace, 0x10ul, r[7]);
> >
> > this is PTRACE_ATTACH
> >
> >         line 115        syscall(__NR_ptrace, 0x4200ul, r[7], 0x40000012ul, 0x100012ul);
> >
> > this is PTRACE_SETOPTIONS and "data" includes PTRACE_O_EXITKILL.
> >
> > r[7] is initialized at
> >
> >         line 110      r[7] = *(uint32_t*)0x20f9cffc;
> >
> > so if it is eq to 1 then it can attach to init and in this case the problem
> > can be explained by the wrong SIGNAL_UNKILLABLE/SIGKILL logic.
> >
> > But how *(uint32_t*)0x20f9cffc can be 1 ?
> >
> >         line 108    r[6] = syscall(__NR_fcntl, r[1], 0x10ul, 0x20f9cff8ul);
> >
> > this is F_GETOWN_EX, addr = 0x20f9cff8 == 0x20f9cffc - 4, so if fcntl()
> > actually succeeds then r[7] == f_owner_ex->pid.
> >
> > It _can_ be 1, but the reproducer doesn't work for me. If you can reproduce,
> > could you try the patch below?
> 
> Hi,
> 
> I would like to understand why you were not able to reproduce it. I
> won't be sitting here all the time, and we are tracking hundreds of
> bugs across different linux kernels and other OSes, so it's
> problematic to do any extensive work on all of them. That's why we try
> to provide reproducers.
> 
> I've just tried the repro on the latest upstream
> (39dae59d66acd86d1de24294bd2f343fd5e7a625) and it triggered the
> WARNING within a second.
> Did you use the config provided? Did you use qemu or real hardware?
> Can you try in qemu (with -smp>1)?

I'm unable to reproduce the warning in qemu with SMP (on a 32-CPU VM).
Instead, when the reproducer is run as root, I immediately get the
following traceback, which is different from the one you reported:

[   45.018469] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000013
[   45.018469]
[   45.019669] CPU: 19 PID: 1 Comm: systemd Not tainted 4.14.0-rc8 #7
[   45.021094] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
[   45.022768] Call Trace:
[   45.023076]  dump_stack+0x12e/0x188
[   45.023481]  panic+0x1e4/0x417
[   45.023821]  ? __warn+0x1d9/0x1d9
[   45.024206]  ? _raw_write_unlock_irq+0x27/0x70
[   45.024705]  do_exit+0x27ac/0x2f60
[   45.025101]  ? trace_hardirqs_on+0xd/0x10
[   45.025551]  ? _raw_spin_unlock_irq+0x27/0x70
[   45.026034]  ? mm_update_next_owner+0x640/0x640
[   45.026540]  ? get_signal+0x675/0x1520
[   45.026971]  ? recalc_sigpending+0x72/0x90
[   45.027464]  ? lock_downgrade+0x820/0x820
[   45.027916]  ? __dequeue_signal+0x640/0x640
[   45.028388]  ? _raw_spin_unlock_irq+0x27/0x70
[   45.028877]  do_group_exit+0x108/0x330
[   45.029297]  get_signal+0x61a/0x1520
[   45.031144]  do_signal+0x8d/0x1a10
[   45.031531]  ? trace_hardirqs_on_caller+0x442/0x5c0
[   45.032105]  ? trace_hardirqs_on+0xd/0x10
[   45.032571]  ? setup_sigcontext+0x7d0/0x7d0
[   45.033071]  ? ep_poll_readyevents_proc+0xa0/0xa0
[   45.033619]  ? rw_verify_area+0xe5/0x2b0
[   45.034063]  ? SyS_timerfd_settime+0xe5/0x140
[   45.034551]  ? exit_to_usermode_loop+0x45/0x230
[   45.035065]  exit_to_usermode_loop+0x16a/0x230
[   45.035599]  ? trace_hardirqs_on_caller+0x442/0x5c0
[   45.036833]  syscall_return_slowpath+0x310/0x3d0
[   45.038547]  entry_SYSCALL_64_fastpath+0xbc/0xbe
[   45.039779] RIP: 0033:0x7fd80a914133
[   45.040215] RSP: 002b:00007fff313d0858 EFLAGS: 00000246 ORIG_RAX: 00000000000000e8
[   45.041683] RAX: fffffffffffffffc RBX: 000055f47338c050 RCX: 00007fd80a914133
[   45.042451] RDX: 000000000000003d RSI: 00007fff313d0860 RDI: 0000000000000004
[   45.043307] RBP: 00007fff313d0c50 R08: 00007fff313d0860 R09: 8258efee6555c1f9
[   45.044107] R10: 00000000ffffffff R11: 0000000000000246 R12: 00007fff313d0860
[   45.045011] R13: ffffffffffffffff R14: 00007fff313d0c70 R15: 0000000000000001
[   45.046217] Kernel Offset: disabled
[   45.046615] Rebooting in 86400 seconds..

Running the same reproducer as an unprivileged user has no effect: the
system continues to run fine, with no warning or panic.

Thanks,

Jamie