[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <yt9dpn3v3u1m.fsf@linux.ibm.com>
Date: Mon, 30 Nov 2020 13:31:33 +0100
From: Sven Schnelle <svens@...ux.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
"Paul E. McKenney" <paulmck@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>
Subject: Re: [GIT pull] locking/urgent for v5.10-rc6
Hi,
Sven Schnelle <svens@...ux.ibm.com> writes:
> Hi Peter,
>
> Peter Zijlstra <peterz@...radead.org> writes:
>
>> On Sun, Nov 29, 2020 at 11:31:41AM -0800, Linus Torvalds wrote:
>>> On Sun, Nov 29, 2020 at 5:38 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>>> >
>>> > Yet two more places which invoke tracing from RCU disabled regions in the
>>> > idle path. Similar to the entry path the low level idle functions have to
>>> > be non-instrumentable.
>>>
>>> This really seems less than optimal.
>>>
>>> In particular, lookie here:
>>>
>>> > @@ -94,9 +94,35 @@ void __cpuidle default_idle_call(void)
>>> >
>>> > trace_cpu_idle(1, smp_processor_id());
>>> > stop_critical_timings();
>>> > +
>>> > + /*
>>> > + * arch_cpu_idle() is supposed to enable IRQs, however
>>> > + * we can't do that because of RCU and tracing.
>>> > + *
>>> > + * Trace IRQs enable here, then switch off RCU, and have
>>> > + * arch_cpu_idle() use raw_local_irq_enable(). Note that
>>> > + * rcu_idle_enter() relies on lockdep IRQ state, so switch that
>>> > + * last -- this is very similar to the entry code.
>>> > + */
>>> > + trace_hardirqs_on_prepare();
>>> > + lockdep_hardirqs_on_prepare(_THIS_IP_);
>>> > rcu_idle_enter();
>>> > + lockdep_hardirqs_on(_THIS_IP_);
>>> > +
>>> > arch_cpu_idle();
>>> > +
>>> > + /*
>>> > + * OK, so IRQs are enabled here, but RCU needs them disabled to
>>> > + * turn itself back on.. funny thing is that disabling IRQs
>>> > + * will cause tracing, which needs RCU. Jump through hoops to
>>> > + * make it 'work'.
>>> > + */
>>> > + raw_local_irq_disable();
>>> > + lockdep_hardirqs_off(_THIS_IP_);
>>> > rcu_idle_exit();
>>> > + lockdep_hardirqs_on(_THIS_IP_);
>>> > + raw_local_irq_enable();
>>> > +
>>> > start_critical_timings();
>>> > trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
>>> > }
>>>
>>> And look at what the code generation for the idle exit path is when
>>> lockdep isn't even on.
>>
>> Agreed.
>>
>> The idea was to flip all of arch_cpu_idle() to not enable interrupts.
>>
>> This is suboptimal for things like x86 where arch_cpu_idle() is
>> basically STI;HLT, but x86 isn't likely to actually use this code path
>> anyway, given all the various cpuidle drivers it has.
>>
>> Many of the other archs are now doing things like arm's:
>> wfi();raw_local_irq_enable().
>>
>> Doing that tree-wide interrupt-state flip was something I didn't want to
>> do at this late a stage, the chanse of messing that up is just too high.
>>
>> After that I need to go look at flipping cpuidle, which is even more
>> 'interesting'. cpuidle_enter() has the exact same semantics, and this is
>> the code path that x86 actually uses, and here it's inconsitent at best.
>
> On s390 this introduces the following splat:
> [..]
I sent you the wrong backtrace. This is the correct one:
[ 0.667491] smp: Bringing up secondary CPUs ...
[ 0.670262] random: get_random_bytes called from __warn+0x12a/0x160 with crng_init=1
[ 0.670280] ------------[ cut here ]------------
[ 0.670288] WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:1054 rcu_irq_enter+0x7e/0xa8
[ 0.670293] Modules linked in:
[ 0.670299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 5.10.0-rc6 #2263
[ 0.670304] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
[ 0.670309] Krnl PSW : 0404d00180000000 0000000000d8a8da (rcu_irq_enter+0x82/0xa8)
[ 0.670318] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[ 0.670325] Krnl GPRS: 0000000000000000 0000000080000002 0000000000000001 000000000101fcee
[ 0.670331] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.670337] 000003e00029ff48 0000000000000000 00000000017212d8 0000000000000001
[ 0.670343] 0000000005ba0100 00000000000324bb 000003e00029fe40 000003e00029fe10
[ 0.670358] Krnl Code: 0000000000d8a8ca: ec180013017e cij %r1,1,8,0000000000d8a8f0
[ 0.670358] 0000000000d8a8d0: ecb80005007e cij %r11,0,8,0000000000d8a8da
[ 0.670358] #0000000000d8a8d6: af000000 mc 0,0
[ 0.670358] >0000000000d8a8da: ebbff0a00004 lmg %r11,%r15,160(%r15)
[ 0.670358] 0000000000d8a8e0: c0f4ffffff68 brcl 15,0000000000d8a7b0
[ 0.670358] 0000000000d8a8e6: c0e5000038c1 brasl %r14,0000000000d91a68
[ 0.670358] 0000000000d8a8ec: a7f4ffdc brc 15,0000000000d8a8a4
[ 0.670358] 0000000000d8a8f0: c0e5000038bc brasl %r14,0000000000d91a68
[ 0.670392] Call Trace:
[ 0.670396] [<0000000000d8a8da>] rcu_irq_enter+0x82/0xa8
[ 0.670401] [<0000000000157f9a>] irq_enter+0x22/0x30
[ 0.670404] [<000000000010e51c>] do_IRQ+0x64/0xd0
[ 0.670408] [<0000000000d9a65a>] ext_int_handler+0x18e/0x194
[ 0.670412] [<0000000000d9a6a0>] psw_idle+0x40/0x48
[ 0.670416] ([<0000000000104202>] enabled_wait+0x22/0xf0)
[ 0.670419] [<00000000001046e2>] arch_cpu_idle+0x22/0x38
[ 0.670423] [<0000000000d986cc>] default_idle_call+0x74/0xd8
[ 0.670427] [<000000000019a94a>] do_idle+0xf2/0x1b0
[ 0.670431] [<000000000019ac7e>] cpu_startup_entry+0x36/0x40
[ 0.670435] [<0000000000118b9a>] smp_start_secondary+0x82/0x88
[ 0.670438] INFO: lockdep is turned off.
[ 0.670441] Last Breaking-Event-Address:
[ 0.670444] [<0000000000157f94>] irq_enter+0x1c/0x30
[ 0.670447] irq event stamp: 19
[ 0.670451] hardirqs last enabled at (19): [<0000000000d98688>] default_idle_call+0x30/0xd8
[ 0.670455] hardirqs last disabled at (18): [<000000000019a9c2>] do_idle+0x16a/0x1b0
[ 0.670459] softirqs last enabled at (0): [<000000000014c3be>] copy_process+0x48e/0x14c0
[ 0.670463] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 0.670466] ---[ end trace cba3783aedff6f79 ]---
[ 0.670685] smp: Brought up 1 node, 2 CPUs
[ 0.676300]
[ 0.676301] =============================
[ 0.676302] WARNING: suspicious RCU usage
[ 0.676303] 5.10.0-rc6 #2263 Not tainted
[ 0.676303] -----------------------------
[ 0.676304] include/trace/events/lock.h:13 suspicious rcu_dereference_check() usage!
[ 0.676305]
[ 0.676306] other info that might help us debug this:
[ 0.676306]
[ 0.676307]
[ 0.676307] rcu_scheduler_active = 1, debug_locks = 1
[ 0.676308] RCU used illegally from extended quiescent state!
[ 0.676309] no locks held by swapper/1/0.
[ 0.676309]
[ 0.676310] stack backtrace:
[ 0.676310] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.10.0-rc6 #2263
[ 0.676311] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
[ 0.676312] Call Trace:
[ 0.676312] [<0000000000d726f8>] show_stack+0x90/0xf8
[ 0.676313] [<0000000000d855f2>] dump_stack+0xa2/0xd8
[ 0.676314] [<00000000001d216e>] trace_lock_acquire+0x1ce/0x1d8
[ 0.676314] [<00000000001d7c44>] lock_acquire+0x44/0x88
[ 0.676315] [<0000000000d98788>] _raw_spin_lock+0x58/0xa8
[ 0.676316] [<00000000001e4e94>] vprintk_emit+0x74/0x1a8
[ 0.676316] [<00000000001e4ffe>] vprintk_default+0x36/0x48
[ 0.676317] [<0000000000d76526>] printk+0x46/0x58
[ 0.676318] [<00000000009ae188>] report_bug+0x110/0x130
[ 0.676318] [<000000000010215c>] monitor_event_exception+0x44/0xc0
[ 0.676319] [<0000000000d9a10e>] pgm_check_handler+0x1da/0x238
[ 0.676320] [<0000000000d8a8da>] rcu_irq_enter+0x82/0xa8
[ 0.676321] ------------[ cut here ]------------
[ 0.676321] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
[ 0.676322] WARNING: CPU: 1 PID: 0 at kernel/locking/lockdep.c:5281 check_flags.part.0+0x1cc/0x208
[ 0.676323] Modules linked in:
[ 0.676324] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.10.0-rc6 #2263
[ 0.676325] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
[ 0.676325] Krnl PSW : 0404c00180000000 00000000001d79d0 (check_flags.part.0+0x1d0/0x208)
[ 0.676327] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 0.676329] Krnl GPRS: 00000000ffffffe5 000000000116dbd4 000000000000002f 0000000000fe8c00
[ 0.676330] 0000000000000000 00000000009ad0e8 0000000000000000 0000000000000000
[ 0.676330] 00000000012d7f20 0000000000000000 040003e000000001 000000000257f310
[ 0.676331] 0000000005ba0100 00000000011725b0 00000000001d79cc 000003e00029f638
[ 0.676332] Krnl Code: 00000000001d79c0: c02000708989 larl %r2,0000000000fe8cd2
[ 0.676333] 00000000001d79c6: c0e5005cde09 brasl %r14,0000000000d735d8
[ 0.676334] #00000000001d79cc: af000000 mc 0,0
[ 0.676335] >00000000001d79d0: c0200070d844 larl %r2,0000000000ff2a58
[ 0.676337] 00000000001d79d6: c0e5005cf585 brasl %r14,0000000000d764e0
[ 0.676338] 00000000001d79dc: a7f4ff3f brc 15,00000000001d785a
[ 0.676339] 00000000001d79e0: c0e5005dd044 brasl %r14,0000000000d91a68
[ 0.676340] 00000000001d79e6: a7f4ff6f brc 15,00000000001d78c4
[ 0.676341] Call Trace:
[ 0.676342] [<00000000001d79d0>] check_flags.part.0+0x1d0/0x208
[ 0.676343] ([<00000000001d79cc>] check_flags.part.0+0x1cc/0x208)
[ 0.676344] [<00000000001d7a9e>] lock_acquire.part.0+0x96/0x1f8
[ 0.676344] [<00000000001d7c7c>] lock_acquire+0x7c/0x88
[ 0.676345] [<000000000027bce4>] __is_insn_slot_addr+0x5c/0x150
[ 0.676346] [<000000000017dbf2>] kernel_text_address+0x11a/0x130
[ 0.676346] [<000000000017dc2e>] __kernel_text_address+0x26/0x70
[ 0.676347] [<000000000011843c>] unwind_next_frame+0x104/0x1c0
[ 0.676348] [<0000000000d72738>] show_stack+0xd0/0xf8
[ 0.676348] [<0000000000d855f2>] dump_stack+0xa2/0xd8
[ 0.676349] [<00000000001d216e>] trace_lock_acquire+0x1ce/0x1d8
[ 0.676350] [<00000000001d7c44>] lock_acquire+0x44/0x88
[ 0.676350] [<0000000000d98788>] _raw_spin_lock+0x58/0xa8
[ 0.676351] [<00000000001e4e94>] vprintk_emit+0x74/0x1a8
[ 0.676352] [<00000000001e4ffe>] vprintk_default+0x36/0x48
[ 0.676352] [<0000000000d76526>] printk+0x46/0x58
[ 0.676353] [<00000000009ae188>] report_bug+0x110/0x130
[ 0.676354] [<000000000010215c>] monitor_event_exception+0x44/0xc0
[ 0.676354] [<0000000000d9a10e>] pgm_check_handler+0x1da/0x238
[ 0.676355] [<0000000000d8a8da>] rcu_irq_enter+0x82/0xa8
[ 0.676356] [<0000000000157f9a>] irq_enter+0x22/0x30
[ 0.676356] [<000000000010e51c>] do_IRQ+0x64/0xd0
[ 0.676357] [<0000000000d9a65a>] ext_int_handler+0x18e/0x194
[ 0.676358] [<0000000000d9a6a0>] psw_idle+0x40/0x48
[ 0.676358] ([<0000000000104202>] enabled_wait+0x22/0xf0)
[ 0.676359] [<00000000001046e2>] arch_cpu_idle+0x22/0x38
[ 0.676360] [<0000000000d986cc>] default_idle_call+0x74/0xd8
[ 0.676360] [<000000000019a94a>] do_idle+0xf2/0x1b0
[ 0.676361] [<000000000019ac7e>] cpu_startup_entry+0x36/0x40
[ 0.676362] [<0000000000118b9a>] smp_start_secondary+0x82/0x88
[ 0.676362] INFO: lockdep is turned off.
[ 0.676363] Last Breaking-Event-Address:
[ 0.676364] [<0000000000d9c4d0>] __s390_indirect_jump_r14+0x0/0xc
[ 0.676364] irq event stamp: 19
[ 0.676365] hardirqs last enabled at (19): [<0000000000d98688>] default_idle_call+0x30/0xd8
[ 0.676366] hardirqs last disabled at (18): [<000000000019a9c2>] do_idle+0x16a/0x1b0
[ 0.676367] softirqs last enabled at (0): [<000000000014c3be>] copy_process+0x48e/0x14c0
[ 0.676367] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 0.676368] ---[ end trace cba3783aedff6f78 ]---
[ 0.676369] possible reason: unannotated irqs-off.
[ 0.676369] irq event stamp: 19
[ 0.676370] hardirqs last enabled at (19): [<0000000000d98688>] default_idle_call+0x30/0xd8
[ 0.676371] hardirqs last disabled at (18): [<000000000019a9c2>] do_idle+0x16a/0x1b0
[ 0.676372] softirqs last enabled at (0): [<000000000014c3be>] copy_process+0x48e/0x14c0
[ 0.676372] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 0.676373] [<0000000000157f9a>] irq_enter+0x22/0x30
[ 0.676374] [<000000000010e51c>] do_IRQ+0x64/0xd0
[ 0.676374] [<0000000000d9a65a>] ext_int_handler+0x18e/0x194
[ 0.676375] [<0000000000d9a6a0>] psw_idle+0x40/0x48
[ 0.676376] ([<0000000000104202>] enabled_wait+0x22/0xf0)
[ 0.676376] [<00000000001046e2>] arch_cpu_idle+0x22/0x38
[ 0.676377] [<0000000000d986cc>] default_idle_call+0x74/0xd8
[ 0.676378] [<000000000019a94a>] do_idle+0xf2/0x1b0
[ 0.676378] [<000000000019ac7e>] cpu_startup_entry+0x36/0x40
[ 0.676379] [<0000000000118b9a>] smp_start_secondary+0x82/0x88
[ 0.676379] INFO: lockdep is turned off.
Powered by blists - more mailing lists