lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <yt9dpn3v3u1m.fsf@linux.ibm.com>
Date:   Mon, 30 Nov 2020 13:31:33 +0100
From:   Sven Schnelle <svens@...ux.ibm.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>
Subject: Re: [GIT pull] locking/urgent for v5.10-rc6

Hi,

Sven Schnelle <svens@...ux.ibm.com> writes:

> Hi Peter,
>
> Peter Zijlstra <peterz@...radead.org> writes:
>
>> On Sun, Nov 29, 2020 at 11:31:41AM -0800, Linus Torvalds wrote:
>>> On Sun, Nov 29, 2020 at 5:38 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>>> >
>>> > Yet two more places which invoke tracing from RCU disabled regions in the
>>> > idle path. Similar to the entry path the low level idle functions have to
>>> > be non-instrumentable.
>>> 
>>> This really seems less than optimal.
>>> 
>>> In particular, lookie here:
>>> 
>>> > @@ -94,9 +94,35 @@ void __cpuidle default_idle_call(void)
>>> >
>>> >                 trace_cpu_idle(1, smp_processor_id());
>>> >                 stop_critical_timings();
>>> > +
>>> > +               /*
>>> > +                * arch_cpu_idle() is supposed to enable IRQs, however
>>> > +                * we can't do that because of RCU and tracing.
>>> > +                *
>>> > +                * Trace IRQs enable here, then switch off RCU, and have
>>> > +                * arch_cpu_idle() use raw_local_irq_enable(). Note that
>>> > +                * rcu_idle_enter() relies on lockdep IRQ state, so switch that
>>> > +                * last -- this is very similar to the entry code.
>>> > +                */
>>> > +               trace_hardirqs_on_prepare();
>>> > +               lockdep_hardirqs_on_prepare(_THIS_IP_);
>>> >                 rcu_idle_enter();
>>> > +               lockdep_hardirqs_on(_THIS_IP_);
>>> > +
>>> >                 arch_cpu_idle();
>>> > +
>>> > +               /*
>>> > +                * OK, so IRQs are enabled here, but RCU needs them disabled to
>>> > +                * turn itself back on.. funny thing is that disabling IRQs
>>> > +                * will cause tracing, which needs RCU. Jump through hoops to
>>> > +                * make it 'work'.
>>> > +                */
>>> > +               raw_local_irq_disable();
>>> > +               lockdep_hardirqs_off(_THIS_IP_);
>>> >                 rcu_idle_exit();
>>> > +               lockdep_hardirqs_on(_THIS_IP_);
>>> > +               raw_local_irq_enable();
>>> > +
>>> >                 start_critical_timings();
>>> >                 trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
>>> >         }
>>> 
>>> And look at what the code generation for the idle exit path is when
>>> lockdep isn't even on.
>>
>> Agreed.
>>
>> The idea was to flip all of arch_cpu_idle() to not enable interrupts.
>>
>> This is suboptimal for things like x86 where arch_cpu_idle() is
>> basically STI;HLT, but x86 isn't likely to actually use this code path
>> anyway, given all the various cpuidle drivers it has.
>>
>> Many of the other archs are now doing things like arm's:
>> wfi();raw_local_irq_enable().
>>
>> Doing that tree-wide interrupt-state flip was something I didn't want to
>> do at this late a stage, the chanse of messing that up is just too high.
>>
>> After that I need to go look at flipping cpuidle, which is even more
>> 'interesting'. cpuidle_enter() has the exact same semantics, and this is
>> the code path that x86 actually uses, and here it's inconsitent at best.
>
> On s390 this introduces the following splat:
> [..]

I sent you the wrong backtrace. This is the correct one:

[    0.667491] smp: Bringing up secondary CPUs ... 
[    0.670262] random: get_random_bytes called from __warn+0x12a/0x160 with crng_init=1 
[    0.670280] ------------[ cut here ]------------ 
[    0.670288] WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:1054 rcu_irq_enter+0x7e/0xa8 
[    0.670293] Modules linked in: 
[    0.670299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W         5.10.0-rc6 #2263 
[    0.670304] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0) 
[    0.670309] Krnl PSW : 0404d00180000000 0000000000d8a8da (rcu_irq_enter+0x82/0xa8) 
[    0.670318]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 
[    0.670325] Krnl GPRS: 0000000000000000 0000000080000002 0000000000000001 000000000101fcee 
[    0.670331]            0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[    0.670337]            000003e00029ff48 0000000000000000 00000000017212d8 0000000000000001 
[    0.670343]            0000000005ba0100 00000000000324bb 000003e00029fe40 000003e00029fe10
 
[    0.670358] Krnl Code: 0000000000d8a8ca: ec180013017e        cij     %r1,1,8,0000000000d8a8f0 
[    0.670358]            0000000000d8a8d0: ecb80005007e        cij     %r11,0,8,0000000000d8a8da 
[    0.670358]           #0000000000d8a8d6: af000000            mc      0,0 
[    0.670358]           >0000000000d8a8da: ebbff0a00004        lmg     %r11,%r15,160(%r15) 
[    0.670358]            0000000000d8a8e0: c0f4ffffff68        brcl    15,0000000000d8a7b0 
[    0.670358]            0000000000d8a8e6: c0e5000038c1        brasl   %r14,0000000000d91a68 
[    0.670358]            0000000000d8a8ec: a7f4ffdc            brc     15,0000000000d8a8a4 
[    0.670358]            0000000000d8a8f0: c0e5000038bc        brasl   %r14,0000000000d91a68 
[    0.670392] Call Trace: 
[    0.670396]  [<0000000000d8a8da>] rcu_irq_enter+0x82/0xa8  
[    0.670401]  [<0000000000157f9a>] irq_enter+0x22/0x30  
[    0.670404]  [<000000000010e51c>] do_IRQ+0x64/0xd0  
[    0.670408]  [<0000000000d9a65a>] ext_int_handler+0x18e/0x194  
[    0.670412]  [<0000000000d9a6a0>] psw_idle+0x40/0x48  
[    0.670416] ([<0000000000104202>] enabled_wait+0x22/0xf0) 
[    0.670419]  [<00000000001046e2>] arch_cpu_idle+0x22/0x38  
[    0.670423]  [<0000000000d986cc>] default_idle_call+0x74/0xd8  
[    0.670427]  [<000000000019a94a>] do_idle+0xf2/0x1b0  
[    0.670431]  [<000000000019ac7e>] cpu_startup_entry+0x36/0x40  
[    0.670435]  [<0000000000118b9a>] smp_start_secondary+0x82/0x88  
[    0.670438] INFO: lockdep is turned off. 
[    0.670441] Last Breaking-Event-Address: 
[    0.670444]  [<0000000000157f94>] irq_enter+0x1c/0x30 
[    0.670447] irq event stamp: 19 
[    0.670451] hardirqs last  enabled at (19): [<0000000000d98688>] default_idle_call+0x30/0xd8 
[    0.670455] hardirqs last disabled at (18): [<000000000019a9c2>] do_idle+0x16a/0x1b0 
[    0.670459] softirqs last  enabled at (0): [<000000000014c3be>] copy_process+0x48e/0x14c0 
[    0.670463] softirqs last disabled at (0): [<0000000000000000>] 0x0 
[    0.670466] ---[ end trace cba3783aedff6f79 ]--- 
[    0.670685] smp: Brought up 1 node, 2 CPUs 
[    0.676300]  
[    0.676301] ============================= 
[    0.676302] WARNING: suspicious RCU usage 
[    0.676303] 5.10.0-rc6 #2263 Not tainted 
[    0.676303] ----------------------------- 
[    0.676304] include/trace/events/lock.h:13 suspicious rcu_dereference_check() usage! 
[    0.676305]  
[    0.676306] other info that might help us debug this: 
[    0.676306]  
[    0.676307]  
[    0.676307] rcu_scheduler_active = 1, debug_locks = 1 
[    0.676308] RCU used illegally from extended quiescent state! 
[    0.676309] no locks held by swapper/1/0. 
[    0.676309]  
[    0.676310] stack backtrace: 
[    0.676310] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.10.0-rc6 #2263 
[    0.676311] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0) 
[    0.676312] Call Trace: 
[    0.676312]  [<0000000000d726f8>] show_stack+0x90/0xf8  
[    0.676313]  [<0000000000d855f2>] dump_stack+0xa2/0xd8  
[    0.676314]  [<00000000001d216e>] trace_lock_acquire+0x1ce/0x1d8  
[    0.676314]  [<00000000001d7c44>] lock_acquire+0x44/0x88  
[    0.676315]  [<0000000000d98788>] _raw_spin_lock+0x58/0xa8 
 
[    0.676316]  [<00000000001e4e94>] vprintk_emit+0x74/0x1a8  
[    0.676316]  [<00000000001e4ffe>] vprintk_default+0x36/0x48  
[    0.676317]  [<0000000000d76526>] printk+0x46/0x58  
[    0.676318]  [<00000000009ae188>] report_bug+0x110/0x130  
[    0.676318]  [<000000000010215c>] monitor_event_exception+0x44/0xc0  
[    0.676319]  [<0000000000d9a10e>] pgm_check_handler+0x1da/0x238  
[    0.676320]  [<0000000000d8a8da>] rcu_irq_enter+0x82/0xa8  
[    0.676321] ------------[ cut here ]------------ 
[    0.676321] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled()) 
[    0.676322] WARNING: CPU: 1 PID: 0 at kernel/locking/lockdep.c:5281 check_flags.part.0+0x1cc/0x208 
[    0.676323] Modules linked in: 
[    0.676324] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.10.0-rc6 #2263 
[    0.676325] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0) 
[    0.676325] Krnl PSW : 0404c00180000000 00000000001d79d0 (check_flags.part.0+0x1d0/0x208) 
[    0.676327]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 
[    0.676329] Krnl GPRS: 00000000ffffffe5 000000000116dbd4 000000000000002f 0000000000fe8c00 
[    0.676330]            0000000000000000 00000000009ad0e8 0000000000000000 0000000000000000 
[    0.676330]            00000000012d7f20 0000000000000000 040003e000000001 000000000257f310 
[    0.676331]            0000000005ba0100 00000000011725b0 00000000001d79cc 000003e00029f638 
[    0.676332] Krnl Code: 00000000001d79c0: c02000708989        larl    %r2,0000000000fe8cd2 
[    0.676333]            00000000001d79c6: c0e5005cde09        brasl   %r14,0000000000d735d8 
[    0.676334]           #00000000001d79cc: af000000            mc      0,0 
[    0.676335]           >00000000001d79d0: c0200070d844        larl    %r2,0000000000ff2a58 
[    0.676337]            00000000001d79d6: c0e5005cf585        brasl   %r14,0000000000d764e0 
[    0.676338]            00000000001d79dc: a7f4ff3f            brc     15,00000000001d785a 
[    0.676339]            00000000001d79e0: c0e5005dd044        brasl   %r14,0000000000d91a68 
[    0.676340]            00000000001d79e6: a7f4ff6f            brc     15,00000000001d78c4 
[    0.676341] Call Trace: 
[    0.676342]  [<00000000001d79d0>] check_flags.part.0+0x1d0/0x208  
[    0.676343] ([<00000000001d79cc>] check_flags.part.0+0x1cc/0x208) 
[    0.676344]  [<00000000001d7a9e>] lock_acquire.part.0+0x96/0x1f8  
[    0.676344]  [<00000000001d7c7c>] lock_acquire+0x7c/0x88  
[    0.676345]  [<000000000027bce4>] __is_insn_slot_addr+0x5c/0x150  
[    0.676346]  [<000000000017dbf2>] kernel_text_address+0x11a/0x130  
[    0.676346]  [<000000000017dc2e>] __kernel_text_address+0x26/0x70  
[    0.676347]  [<000000000011843c>] unwind_next_frame+0x104/0x1c0  
[    0.676348]  [<0000000000d72738>] show_stack+0xd0/0xf8  
[    0.676348]  [<0000000000d855f2>] dump_stack+0xa2/0xd8  
[    0.676349]  [<00000000001d216e>] trace_lock_acquire+0x1ce/0x1d8  
[    0.676350]  [<00000000001d7c44>] lock_acquire+0x44/0x88  
[    0.676350]  [<0000000000d98788>] _raw_spin_lock+0x58/0xa8  
[    0.676351]  [<00000000001e4e94>] vprintk_emit+0x74/0x1a8  
[    0.676352]  [<00000000001e4ffe>] vprintk_default+0x36/0x48  
[    0.676352]  [<0000000000d76526>] printk+0x46/0x58  
[    0.676353]  [<00000000009ae188>] report_bug+0x110/0x130  
[    0.676354]  [<000000000010215c>] monitor_event_exception+0x44/0xc0  
[    0.676354]  [<0000000000d9a10e>] pgm_check_handler+0x1da/0x238  
[    0.676355]  [<0000000000d8a8da>] rcu_irq_enter+0x82/0xa8  
[    0.676356]  [<0000000000157f9a>] irq_enter+0x22/0x30  
[    0.676356]  [<000000000010e51c>] do_IRQ+0x64/0xd0  
[    0.676357]  [<0000000000d9a65a>] ext_int_handler+0x18e/0x194  
[    0.676358]  [<0000000000d9a6a0>] psw_idle+0x40/0x48  
[    0.676358] ([<0000000000104202>] enabled_wait+0x22/0xf0) 
[    0.676359]  [<00000000001046e2>] arch_cpu_idle+0x22/0x38  
[    0.676360]  [<0000000000d986cc>] default_idle_call+0x74/0xd8  
[    0.676360]  [<000000000019a94a>] do_idle+0xf2/0x1b0  
[    0.676361]  [<000000000019ac7e>] cpu_startup_entry+0x36/0x40  
[    0.676362]  [<0000000000118b9a>] smp_start_secondary+0x82/0x88  
[    0.676362] INFO: lockdep is turned off. 
[    0.676363] Last Breaking-Event-Address: 
[    0.676364]  [<0000000000d9c4d0>] __s390_indirect_jump_r14+0x0/0xc 
[    0.676364] irq event stamp: 19 
[    0.676365] hardirqs last  enabled at (19): [<0000000000d98688>] default_idle_call+0x30/0xd8 
[    0.676366] hardirqs last disabled at (18): [<000000000019a9c2>] do_idle+0x16a/0x1b0 
[    0.676367] softirqs last  enabled at (0): [<000000000014c3be>] copy_process+0x48e/0x14c0 
[    0.676367] softirqs last disabled at (0): [<0000000000000000>] 0x0 
[    0.676368] ---[ end trace cba3783aedff6f78 ]--- 
[    0.676369] possible reason: unannotated irqs-off. 
[    0.676369] irq event stamp: 19 
[    0.676370] hardirqs last  enabled at (19): [<0000000000d98688>] default_idle_call+0x30/0xd8 
[    0.676371] hardirqs last disabled at (18): [<000000000019a9c2>] do_idle+0x16a/0x1b0 
[    0.676372] softirqs last  enabled at (0): [<000000000014c3be>] copy_process+0x48e/0x14c0 
[    0.676372] softirqs last disabled at (0): [<0000000000000000>] 0x0 
[    0.676373]  [<0000000000157f9a>] irq_enter+0x22/0x30  
[    0.676374]  [<000000000010e51c>] do_IRQ+0x64/0xd0  
[    0.676374]  [<0000000000d9a65a>] ext_int_handler+0x18e/0x194  
[    0.676375]  [<0000000000d9a6a0>] psw_idle+0x40/0x48  
[    0.676376] ([<0000000000104202>] enabled_wait+0x22/0xf0) 
[    0.676376]  [<00000000001046e2>] arch_cpu_idle+0x22/0x38  
[    0.676377]  [<0000000000d986cc>] default_idle_call+0x74/0xd8  
[    0.676378]  [<000000000019a94a>] do_idle+0xf2/0x1b0  
[    0.676378]  [<000000000019ac7e>] cpu_startup_entry+0x36/0x40  
[    0.676379]  [<0000000000118b9a>] smp_start_secondary+0x82/0x88  
[    0.676379] INFO: lockdep is turned off. 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ