Message-ID: <dc71306f-2693-0e02-8886-5daf96cfa11d@linux.intel.com>
Date: Mon, 13 Jul 2020 10:23:31 +0800
From: "Li, Aubrey" <aubrey.li@...ux.intel.com>
To: Joel Fernandes <joel@...lfernandes.org>
Cc: Vineeth Remanan Pillai <vpillai@...italocean.com>,
Nishanth Aravamudan <naravamudan@...italocean.com>,
Julien Desfossez <jdesfossez@...italocean.com>,
Peter Zijlstra <peterz@...radead.org>,
Tim Chen <tim.c.chen@...ux.intel.com>, mingo@...nel.org,
tglx@...utronix.de, pjt@...gle.com, torvalds@...ux-foundation.org,
linux-kernel@...r.kernel.org, subhra.mazumdar@...cle.com,
fweisbec@...il.com, keescook@...omium.org, kerrnel@...gle.com,
Phil Auld <pauld@...hat.com>, Aaron Lu <aaron.lwe@...il.com>,
Aubrey Li <aubrey.intel@...il.com>,
Valentin Schneider <valentin.schneider@....com>,
Mel Gorman <mgorman@...hsingularity.net>,
Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Paolo Bonzini <pbonzini@...hat.com>, vineethrp@...il.com,
Chen Yu <yu.c.chen@...el.com>,
Christian Brauner <christian.brauner@...ntu.com>,
Tim Chen <tim.c.chen@...el.com>,
"Paul E . McKenney" <paulmck@...nel.org>
Subject: Re: [RFC PATCH 14/16] irq: Add support for core-wide protection of
IRQ and softirq
On 2020/7/10 21:21, Joel Fernandes wrote:
> On Fri, Jul 10, 2020 at 08:19:24PM +0800, Li, Aubrey wrote:
>> Hi Joel/Vineeth,
>>
>>
>> The problem is gone when we reverted this patch. We are running multiple
>> uperf threads(equal to cpu number) in a cgroup with coresched enabled.
>> This is 100% reproducible on our side.
>
> Interesting. I am guessing you are not doing any hotplug since those fixes
> were removed from v6 to expose those hotplug issues..
>
> The last known lockups with this patch were fixed. Appreciate if you can dig
> in more and provide logs/traces. The last one I remember was:
>
> HT1 HT2
> irq_enter()
> - sets the core-wide flag
> <softirq running>
> acquires a lock.
> <gets irq>
> irq_enter() - do nothing.
> irq_exit() - busy wait on flag.
> irq_exit()
> <softirq running>
> acquire a lock and deadlock.
>
> The fix was to call sched_core_irq_enter() when you enter enter a softirq
> from paths other than irq_exit().
>
> Other than this one, we have not seen lockups in heavy testing over the last
> 2 months since we redesigned this patch to enter the 'private state' on the
> outer-most core-wide sched_core_irq_enter().

The first soft lockup panic is on CPU75, which is waiting on a TLB flush IPI:

[ 170.641645] CPU: 75 PID: 5393 Comm: uperf Kdump: loaded Not tainted 5.7.6+ #3
[ 170.641651] RIP: 0010:smp_call_function_many_cond+0x2b1/0x2e0
----snip----
[ 170.641660] Call Trace:
[ 170.641666] ? flush_tlb_func_common.constprop.10+0x220/0x220
[ 170.641668] ? x86_configure_nx+0x50/0x50
[ 170.641669] ? flush_tlb_func_common.constprop.10+0x220/0x220
[ 170.641670] on_each_cpu_cond_mask+0x2f/0x80
[ 170.641671] flush_tlb_mm_range+0xab/0xe0
[ 170.641677] change_protection+0x18a/0xca0
[ 170.641682] ? __switch_to_asm+0x34/0x70
[ 170.641685] change_prot_numa+0x15/0x30
[ 170.641689] task_numa_work+0x1aa/0x2c0
[ 170.641694] task_work_run+0x76/0xa0
[ 170.641698] exit_to_usermode_loop+0xeb/0xf0
[ 170.641700] do_syscall_64+0x1aa/0x1d0
[ 170.641701] entry_SYSCALL_64_after_hwframe+0x44/0xa9

If I read the code correctly, some CPU is stuck in irq_exit(), so the IPI
from CPU75 never completes. I found that this CPU is CPU91:

[ 170.652257] CPU: 91 PID: 5401 Comm: uperf Kdump: loaded Not tainted 5.7.6+ #3
[ 170.652257] RIP: 0010:sched_core_irq_exit+0xcc/0x110
----snip----
[ 170.652261] Call Trace:
[ 170.652262] <IRQ>
[ 170.652262] irq_exit+0x6a/0xb0
[ 170.652262] smp_apic_timer_interrupt+0x74/0x130
[ 170.652262] apic_timer_interrupt+0xf/0x20

Then I checked the stack of CPU91's sibling, CPU19, and found it spinning on a
spin lock:

[ 170.643678] CPU: 19 PID: 5385 Comm: uperf Kdump: loaded Not tainted 5.7.6+ #3
[ 170.643679] RIP: 0010:native_queued_spin_lock_slowpath+0x137/0x1e0
[ 170.643684] Call Trace:
[ 170.643684] <IRQ>
[ 170.643684] _raw_spin_lock+0x1b/0x20
[ 170.643685] tcp_delack_timer+0x2c/0xf0
[ 170.643685] ? tcp_delack_timer_handler+0x170/0x170
[ 170.643685] call_timer_fn+0x2d/0x130
[ 170.643685] run_timer_softirq+0x420/0x450
[ 170.643686] ? enqueue_hrtimer+0x39/0x90
[ 170.643686] ? __hrtimer_run_queues+0x138/0x290
[ 170.643686] __do_softirq+0xed/0x2f0
[ 170.643686] irq_exit+0xad/0xb0
[ 170.643686] smp_apic_timer_interrupt+0x74/0x130
[ 170.643687] apic_timer_interrupt+0xf/0x20
----snip----
[ 170.643738] entry_SYSCALL_64_after_hwframe+0x44/0xa9

So I guess the problem is:

    CPU91                               CPU19
    (1) hold a bh_lock_sock(sk)
    (2) <gets irq>
                                        (3) <gets irq>
    (4) irq_exit()
        -> sched_core_irq_exit()
           - not outermost, wait()
                                        (5) invoke softirq
                                        (6) acquire bh_lock_sock() and deadlock
                                        (7) sched_core_irq_exit()
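
To make the suspected dependency explicit, here is a small user-space sketch
of the wait-for relation read off the three traces above. It is only a model
under my assumptions: the struct, the function names, and the "who waits on
whom" edges are hypothetical and come from my reading of the traces, not from
the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: each CPU is blocked on at most one other CPU. */
enum { NO_WAIT = -1 };

struct traced_cpu {
    int id;        /* CPU number as printed in the trace */
    int waits_on;  /* index of the CPU it is blocked on, or NO_WAIT */
};

/* Follow the wait-for chain from 'start'; true if it leads back to 'start'. */
static bool waits_in_cycle(const struct traced_cpu *cpus, int n, int start)
{
    assert(start >= 0 && start < n);

    int cur = cpus[start].waits_on;
    for (int steps = 0; steps < n && cur != NO_WAIT; steps++) {
        if (cur == start)
            return true;
        cur = cpus[cur].waits_on;
    }
    return false;
}

/* Edges assumed from the traces; index 0 = CPU91, 1 = CPU19, 2 = CPU75. */
static bool trace_cpu_in_cycle(int idx)
{
    const struct traced_cpu cpus[3] = {
        { .id = 91, .waits_on = 1 }, /* spins in sched_core_irq_exit() on its sibling */
        { .id = 19, .waits_on = 0 }, /* spins on the lock assumed held by CPU91 */
        { .id = 75, .waits_on = 0 }, /* waits for CPU91 to handle the TLB flush IPI */
    };
    return waits_in_cycle(cpus, 3, idx);
}
```

In this model CPU91 and CPU19 wait on each other, while CPU75 is blocked
behind the cycle without being part of it, which would match CPU75 being the
first to report the soft lockup.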

In case I misunderstood anything, I have attached the full dmesg.

IMHO, could we let the IRQ exit path complete and do the wait before returning
to user mode instead? I think we can trust anything running in the kernel.

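
A rough user-space sketch of what I mean, replaying steps (1)-(7) with the
wait moved out of the IRQ exit path. All names here are hypothetical and exist
only in this model; it also assumes the lock holder drops the lock in kernel
code before it tries to return to user mode.

```c
#include <stdbool.h>

/* Hypothetical model of one core; none of these names exist in the kernel. */
struct core_model {
    int irq_nest;      /* siblings currently inside IRQ/softirq context */
    bool sock_locked;  /* stands in for bh_lock_sock(sk) */
};

static void model_irq_enter(struct core_model *core) { core->irq_nest++; }

/* Never waits: leaving IRQ context only drops the core-wide count. */
static void model_irq_exit(struct core_model *core) { core->irq_nest--; }

/* The wait would live here instead: user mode is allowed only when no
 * sibling is still in IRQ/softirq context. */
static bool model_may_return_to_user(const struct core_model *core)
{
    return core->irq_nest == 0;
}

static bool model_trylock(struct core_model *core)
{
    if (core->sock_locked)
        return false;
    core->sock_locked = true;
    return true;
}

static void model_unlock(struct core_model *core) { core->sock_locked = false; }

/* Replay the guessed sequence with the deferred wait. */
static bool deferred_wait_scenario_completes(void)
{
    struct core_model core = { 0 };

    bool ok = model_trylock(&core);               /* (1) CPU91 takes the lock */
    model_irq_enter(&core);                       /* (2) CPU91 gets an IRQ */
    model_irq_enter(&core);                       /* (3) CPU19 gets an IRQ */
    model_irq_exit(&core);                        /* (4) CPU91 exits without waiting */
    ok = ok && !model_may_return_to_user(&core);  /* CPU19 is still in IRQ context */
    model_unlock(&core);                          /* CPU91 resumes kernel code, unlocks */
    ok = ok && model_trylock(&core);              /* (6) CPU19's softirq gets the lock */
    model_unlock(&core);
    model_irq_exit(&core);                        /* (7) CPU19 leaves IRQ context */
    return ok && model_may_return_to_user(&core); /* both may now go to user mode */
}
```

In the model the sequence runs to completion because CPU91 can finish its
kernel work and release the lock, and is only held back at the user-mode
boundary while CPU19 is still in IRQ context.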
Thanks,
-Aubrey
View attachment "dmesg.txt" of type "text/plain" (216893 bytes)