lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 4 Mar 2020 09:59:38 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Kris Karas <linux-1993@...nlit-rail.com>
Cc:     syzbot <syzbot+0c5c2dbf76930df91489@...kaller.appspotmail.com>,
        David Miller <davem@...emloft.net>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        "open list:HARDWARE RANDOM NUMBER GENERATOR CORE" 
        <linux-crypto@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Kate Stewart <kstewart@...uxfoundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Eric Biggers <ebiggers@...nel.org>, allison@...utok.net
Subject: Re: INFO: rcu detected stall in sys_keyctl

On Wed, Mar 4, 2020 at 9:41 AM Kris Karas <linux-1993@...nlit-rail.com> wrote:
>
> Resending this to all the original CCs per suggestion of Dmitry.
> I'm not a member of linux-crypto, no idea if it will bounce; in any
> case, the OOPS I saw does not appear to be crypto related.
>
> Dmitry Vyukov wrote:
> > syzbot wrote:
> >> Call Trace:
> >>   <IRQ>
> >>   __dump_stack lib/dump_stack.c:77 [inline]
> >>   dump_stack+0x197/0x210 lib/dump_stack.c:118
> >>   nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
> >>   nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
> >>   arch_trigger_cpumask_backtrace+0x14/0x20
> >> arch/x86/kernel/apic/hw_nmi.c:38
> >>   trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
> >>   rcu_dump_cpu_stacks+0x183/0x1cf kernel/rcu/tree_stall.h:254
> >>   print_cpu_stall kernel/rcu/tree_stall.h:475 [inline]
> >>   check_cpu_stall kernel/rcu/tree_stall.h:549 [inline]
> >>   rcu_pending kernel/rcu/tree.c:3030 [inline]
> >>   rcu_sched_clock_irq.cold+0x51a/0xc37 kernel/rcu/tree.c:2276
> >>   update_process_times+0x2d/0x70 kernel/time/timer.c:1726
> >>   tick_sched_handle+0xa2/0x190 kernel/time/tick-sched.c:171
> >>   tick_sched_timer+0x53/0x140 kernel/time/tick-sched.c:1314
> >>   __run_hrtimer kernel/time/hrtimer.c:1517 [inline]
> >>   __hrtimer_run_queues+0x364/0xe40 kernel/time/hrtimer.c:1579
> >>   hrtimer_interrupt+0x314/0x770 kernel/time/hrtimer.c:1641
> >>   local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1119 [inline]
> >>   smp_apic_timer_interrupt+0x160/0x610 arch/x86/kernel/apic/apic.c:1144
> >>   apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
> >>   </IRQ>
> >>
> > +lib/mpi maintainers
> >
> > I wonder if this can also be triggered by remote actors (tls, wifi,
> > usb, etc).
> >
>
> This looks somewhat similar to an OOPS + rcu stall I reported earlier in
> reply to Greg KH's announcement of 5.5.7:
>
>      rcu: INFO: rcu_sched self-detected stall on CPU
>      rcu:    14-....: (20999 ticks this GP)
> idle=216/1/0x4000000000000002 softirq=454/454 fqs=5250
>              (t=21004 jiffies g=-755 q=1327)
>      NMI backtrace for cpu 14
>      CPU: 14 PID: 520 Comm: pidof Tainted: G      D           5.5.7 #1
>      Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470
> Taichi, BIOS P3.50 07/18/2019
>      Call Trace:
>       <IRQ>
>       dump_stack+0x50/0x70
>       nmi_cpu_backtrace.cold+0x14/0x53
>       ? lapic_can_unplug_cpu.cold+0x44/0x44
>       nmi_trigger_cpumask_backtrace+0x7b/0x88
>       rcu_dump_cpu_stacks+0x7b/0xa9
>       rcu_sched_clock_irq.cold+0x152/0x39b
>       update_process_times+0x1f/0x50
>       tick_sched_timer+0x40/0x90
>       ? tick_sched_do_timer+0x50/0x50
>       __hrtimer_run_queues+0xdd/0x180
>       hrtimer_interrupt+0x108/0x230
>       smp_apic_timer_interrupt+0x53/0xa0
>       apic_timer_interrupt+0xf/0x20
>       </IRQ>
>
> I don't have a reproducer for it, either.  It showed up in 5.5.7 (but
> might be from earlier as it reproduces so infrequently).

Hi Kris,

What follows after this stack? That's the most interesting part. The
part that you showed is common for all stalls and does not mean much,
besides the fact that there is a stall. These can well be very
different stalls in different parts of kernel.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ