Date:   Fri, 4 Oct 2019 10:29:17 +0100
From:   Russell King - ARM Linux admin <linux@...linux.org.uk>
To:     Petr Mladek <pmladek@...e.com>
Cc:     Will Deacon <will@...nel.org>, Kees Cook <keescook@...omium.org>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Feng Tang <feng.tang@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-arm-kernel@...ts.infradead.org,
        Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
        stable@...r.kernel.org, contact@...ium.me
Subject: Re: [PATCH] panic: Ensure preemption is disabled during panic()

On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote:
> On Thu 2019-10-03 21:56:34, Will Deacon wrote:
> > Hi Kees,
> > 
> > On Wed, Oct 02, 2019 at 01:58:46PM -0700, Kees Cook wrote:
> > > On Wed, Oct 02, 2019 at 01:35:38PM +0100, Will Deacon wrote:
> > > > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the
> > > > calling CPU in an infinite loop, but with interrupts and preemption
> > > > enabled. From this state, userspace can continue to be scheduled,
> > > > despite the system being "dead" as far as the kernel is concerned. This
> > > > is easily reproducible on arm64 when booting with "nosmp" on the command
> > > > line; a couple of shell scripts print out a periodic "Ping" message
> > > > whilst another triggers a crash by writing to /proc/sysrq-trigger:
> > > > 
> > > >   | sysrq: Trigger a crash
> > > >   | Kernel panic - not syncing: sysrq triggered crash
> > > >   | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1
> > > >   | Hardware name: linux,dummy-virt (DT)
> > > >   | Call trace:
> > > >   |  dump_backtrace+0x0/0x148
> > > >   |  show_stack+0x14/0x20
> > > >   |  dump_stack+0xa0/0xc4
> > > >   |  panic+0x140/0x32c
> > > >   |  sysrq_handle_reboot+0x0/0x20
> > > >   |  __handle_sysrq+0x124/0x190
> > > >   |  write_sysrq_trigger+0x64/0x88
> > > >   |  proc_reg_write+0x60/0xa8
> > > >   |  __vfs_write+0x18/0x40
> > > >   |  vfs_write+0xa4/0x1b8
> > > >   |  ksys_write+0x64/0xf0
> > > >   |  __arm64_sys_write+0x14/0x20
> > > >   |  el0_svc_common.constprop.0+0xb0/0x168
> > > >   |  el0_svc_handler+0x28/0x78
> > > >   |  el0_svc+0x8/0xc
> > > >   | Kernel Offset: disabled
> > > >   | CPU features: 0x0002,24002004
> > > >   | Memory Limit: none
> > > >   | ---[ end Kernel panic - not syncing: sysrq triggered crash ]---
> > > >   |  Ping 2!
> > > >   |  Ping 1!
> > > >   |  Ping 1!
> > > >   |  Ping 2!
> > > > 
> > > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise
> > > > local interrupts are disabled in 'smp_send_stop()'.
> > > > 
> > > > Disable preemption in 'panic()' before re-enabling interrupts.
> > > 
> > > Is this perhaps the correct solution for what commit c39ea0b9dd24 ("panic:
> > > avoid the extra noise dmesg") was trying to fix?
> > 
> > Hmm, maybe, although that looks like it's focussed more on irq handling
> > than preemption.
> 
> Exactly, the backtrace mentioned in commit c39ea0b9dd24 ("panic: avoid
> the extra noise dmesg") is printed by wake_up() called from
> wake_up_klogd_work_func(). It is irq_work. Therefore disabling
> preemption would not prevent this.
> 
> 
> > I've deliberately left the irq part alone, since I think
> > having magic sysrq work via the keyboard interrupt is desirable from the
> > panic loop.
> 
> I agree that we should keep sysrq working.
> 
> One unfortunate thing is that led_panic_blink() in
> drivers/leds/trigger/ledtrig-panic.c uses workqueues:
> 
>   + led_panic_blink()
>     + led_trigger_event()
>       + led_set_brightness()
>         + schedule_work()
> 
> It means that it depends on the scheduler. I guess that it
> does not work in many panic situations. But this patch
> will always block it.
> 
> I agree that it is strange that userspace still works at
> this stage. But does it cause any real problems?

Yes, there are watchdog drivers that continue to pat their watchdog
after the kernel has panicked.  That makes the watchdog useless, which
is exactly how this problem was discovered.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
