Message-ID: <2c243f59-6d10-7abb-bab4-e7b1796cd54f@jv-coder.de>
Date:   Thu, 28 May 2020 13:41:08 +0200
From:   Joerg Vehlow <lkml@...coder.de>
To:     linux-kernel@...r.kernel.org,
        Joerg Vehlow <joerg.vehlow@...-tech.de>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Steven Rostedt <rostedt@...dmis.org>
Subject: [BUG RT] dump-capture kernel not executed for panic in interrupt
 context

Hi,

I think I found a bug in the kernel with rt patches (or maybe even without).
This probably applies to all kernels starting at 2.6.27.

When a kernel panic is triggered from an interrupt handler, the dump-capture
kernel is not started; instead the system acts as if it was not installed.
The reason is that panic calls __crash_kexec, which is protected by a mutex.
On an rt kernel this mutex is an rt mutex, and when trylock is called on an
rt mutex, the first check is whether the caller is in an nmi or irq handler.
If it is, the function just returns 0 -> locking failed.
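
To make the failure mode concrete, here is a small userspace model of the
behaviour described above. It is only a sketch and not kernel code: all
names (model_in_hardirq, model_rt_trylock, model_crash_kexec) are made up
for illustration, and the flag stands in for the kernel's irq/nmi context
checks.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's "am I in irq/nmi context" check. */
static bool model_in_hardirq;

/* Toy lock; the real kexec_mutex is a kernel (rt) mutex. */
struct model_lock {
    atomic_flag taken;
};
#define MODEL_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Models the trylock described above: refuse outright in irq context. */
static int model_rt_trylock(struct model_lock *l)
{
    if (model_in_hardirq)
        return 0;                       /* locking failed */
    return !atomic_flag_test_and_set(&l->taken);
}

static void model_unlock(struct model_lock *l)
{
    atomic_flag_clear(&l->taken);
}

/*
 * Models the panic path: returns 1 if the dump-capture kernel would be
 * started, 0 if the step is skipped because the lock was not taken.
 */
static int model_crash_kexec(struct model_lock *l)
{
    if (!model_rt_trylock(l))
        return 0;
    /* The real code would jump into the dump-capture kernel here. */
    model_unlock(l);
    return 1;
}
```

With model_in_hardirq set to false, model_crash_kexec() returns 1. With it
set to true it returns 0 even though nobody holds the lock, which mirrors
what I see on the rt kernel.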

According to the rt_mutex_trylock documentation, it is not allowed to call
this function from an irq handler, but panic can be called from everywhere
and thus rt_mutex_trylock can be called from everywhere. Actually, even
mutex_trylock has the comment that it is not supposed to be used from
interrupt context, but it still locks the mutex. I guess this could also be
a bug in the non-rt kernel.

I found this problem using a test module that triggers the softlockup
detection. It is a pretty simple module that creates a kthread, which
disables preemption, spins for 60 seconds in a busy loop, and then reenables
preemption and terminates. This reliably triggers the softlockup detection,
and if kernel.softlockup_panic=0, the system resumes perfectly fine
afterwards. If kernel.softlockup_panic=1, I would expect the dump-capture
kernel to be executed, but due to the bug it is not (without rt patches it
works). Instead, the panic function runs to its end and stays in its final
endless loop.


A stacktrace captured at the trylock call inside kexec_core looks like this:
#0  __rt_mutex_trylock (lock=0xffffffff81701aa0 <kexec_mutex>) at /usr/src/kernel/kernel/locking/rtmutex.c:2110
#1  0xffffffff8087601a in _mutex_trylock (lock=<optimised out>) at /usr/src/kernel/kernel/locking/mutex-rt.c:185
#2  0xffffffff803022a0 in __crash_kexec (regs=0x0 <irq_stack_union>) at /usr/src/kernel/kernel/kexec_core.c:941
#3  0xffffffff8027af59 in panic (fmt=0xffffffff80fa3d66 "softlockup: hung tasks") at /usr/src/kernel/kernel/panic.c:198
#4  0xffffffff80325b6d in watchdog_timer_fn (hrtimer=<optimised out>) at /usr/src/kernel/kernel/watchdog.c:464
#5  0xffffffff802e6b90 in __run_hrtimer (flags=<optimised out>, now=<optimised out>, timer=<optimised out>, base=<optimised out>, cpu_base=<optimised out>) at /usr/src/kernel/kernel/time/hrtimer.c:1417
#6  __hrtimer_run_queues (cpu_base=0xffff88807db1c000, now=<optimised out>, flags=<optimised out>, active_mask=<optimised out>) at /usr/src/kernel/kernel/time/hrtimer.c:1479
#7  0xffffffff802e7704 in hrtimer_interrupt (dev=<optimised out>) at /usr/src/kernel/kernel/time/hrtimer.c:1539
#8  0xffffffff80a020f2 in local_apic_timer_interrupt () at /usr/src/kernel/arch/x86/kernel/apic/apic.c:1067
#9  smp_apic_timer_interrupt (regs=<optimised out>) at /usr/src/kernel/arch/x86/kernel/apic/apic.c:1092
#10 0xffffffff80a015df in apic_timer_interrupt () at /usr/src/kernel/arch/x86/entry/entry_64.S:909


Obviously and as expected the panic was triggered in the context of the apic
interrupt. So in_irq() is true and trylock fails.


About 12 years ago this was not implemented using a mutex, but using xchg.
See: 8c5a1cf0ad3ac5fcdf51314a63b16a440870f6a2
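
For comparison, an xchg-style guard can be modelled in userspace with C11
atomics roughly like this. Again, this is only a sketch with made-up names
(model_kexec_lock, model_xchg_trylock, model_xchg_unlock), not the old
kernel code itself. The point is that a plain atomic exchange does not care
about the context it is called from.

```c
#include <stdatomic.h>

/* 0 = free, 1 = held. Statically zero-initialised, so it starts free. */
static atomic_int model_kexec_lock;

/*
 * Atomically set the flag and look at the previous value.
 * Returns 1 if we took the lock, 0 if somebody else already holds it.
 */
static int model_xchg_trylock(void)
{
    return atomic_exchange(&model_kexec_lock, 1) == 0;
}

/* Release the lock by storing 0 again. */
static void model_xchg_unlock(void)
{
    atomic_store(&model_kexec_lock, 0);
}
```

The first model_xchg_trylock() call succeeds, a second one fails until
model_xchg_unlock() is called. There is no check for irq or nmi context
anywhere in this scheme.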


Since my knowledge about mutexes inside the kernel is very limited, I do not
know how this can be fixed, and whether it should be fixed in the rt patches
or if this really is a bug in the mainline kernel (because trylock is also
not allowed to be used in interrupt handlers).


Jörg
