Message-ID: <20201123185338.GA2263110@elver.google.com>
Date: Mon, 23 Nov 2020 19:53:38 +0100
From: Marco Elver <elver@...gle.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
Anders Roxell <anders.roxell@...aro.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Alexander Potapenko <glider@...gle.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Jann Horn <jannh@...gle.com>,
Mark Rutland <mark.rutland@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
kasan-dev <kasan-dev@...glegroups.com>, rcu@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>,
Tejun Heo <tj@...nel.org>,
Lai Jiangshan <jiangshanlai@...il.com>
Subject: Re: [PATCH] kfence: Avoid stalling work queue task without
allocations
On Mon, Nov 23, 2020 at 11:36AM -0500, Steven Rostedt wrote:
> On Mon, 23 Nov 2020 11:28:12 -0500
> Steven Rostedt <rostedt@...dmis.org> wrote:
>
> > I noticed:
> >
> >
> > [ 237.650900] enabling event benchmark_event
> >
> > In both traces. Could you disable CONFIG_TRACEPOINT_BENCHMARK and see if
> > the issue goes away? That event kicks off a thread that spins in a tight
> > loop for some time and could possibly cause some issues.
> >
> > It still shouldn't break things, but if it is the culprit we can narrow it down.
>
> And it probably is the issue because that thread will never sleep! It runs
> a loop of:
>
>
> static int benchmark_event_kthread(void *arg)
> {
> 	/* sleep a bit to make sure the tracepoint gets activated */
> 	msleep(100);
>
> 	while (!kthread_should_stop()) {
>
> 		trace_do_benchmark();
>
> 		/*
> 		 * We don't go to sleep, but let others run as well.
> 		 * This is basically a "yield()" to let any task that
> 		 * wants to run, schedule in, but if the CPU is idle,
> 		 * we'll keep burning cycles.
> 		 *
> 		 * Note the tasks_rcu_qs() version of cond_resched() will
> 		 * notify synchronize_rcu_tasks() that this thread has
> 		 * passed a quiescent state for rcu_tasks. Otherwise
> 		 * this thread will never voluntarily schedule which would
> 		 * block synchronize_rcu_tasks() indefinitely.
> 		 */
> 		cond_resched_tasks_rcu_qs();
> 	}
>
> 	return 0;
> }
>
>
> Did something change such that "cond_resched_tasks_rcu_qs()" no longer lets
> things progress on ARM64?
>
> I noticed that you have PREEMPT enabled, so this thread will only be
> preempted when its time slice runs out and something else wants to run. How
> would that affect other threads?
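
(For reference, cond_resched_tasks_rcu_qs() should expand to roughly the
below -- a simplified sketch from memory of include/linux/rcupdate.h, so the
exact definition in -next may differ:

	/*
	 * Report a Tasks-RCU quiescent state for the current task, then
	 * offer to reschedule; in the benchmark loop above this is the
	 * only voluntary scheduling point.
	 */
	#define cond_resched_tasks_rcu_qs() \
	do { \
		rcu_tasks_qs(current, false); \
		cond_resched(); \
	} while (0)

so in principle the loop does pass through a quiescent state and a voluntary
scheduling point on every iteration.)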
Bad news: It still crashes with CONFIG_TRACEPOINT_BENCHMARK=n :-/
| [ 118.146638] Testing all events: OK
| [ 169.829624] Running tests again, along with the function tracer
| [ 169.843216] Running tests on all trace events:
| [ 169.855714] Testing all events:
| [ 173.025250] hrtimer: interrupt took 10544064 ns
| [ 253.206548] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 10s!
| [ 253.314369] Showing busy workqueues and worker pools:
| [ 253.324289] workqueue events: flags=0x0
| [ 253.349488] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
| [ 253.362043] pending: vmstat_shepherd
| [ 253.380223] workqueue events_power_efficient: flags=0x82
| [ 253.417256] pwq 2: cpus=0 flags=0x4 nice=0 active=1/256 refcnt=3
| [ 253.428133] in-flight: 61:check_lifetime
| [ 253.446077] pool 2: cpus=0 flags=0x4 nice=0 hung=5s workers=4 idle: 101 99 7
| [ 444.944487] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 10s!
| [ 445.034039] Showing busy workqueues and worker pools:
| [ 445.050842] workqueue events: flags=0x0
| [ 445.056803] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
| [ 445.067830] pending: vmstat_shepherd
| [ 445.080635] pool 2: cpus=0 flags=0x5 nice=0 hung=11s workers=2 manager: 61 idle: 101
| [ 462.719775] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
| [ 462.726014] (detected by 0, t=3752 jiffies, g=2405, q=4)
| [ 462.728031] rcu: All QSes seen, last rcu_preempt kthread activity 3211 (4295005085-4295001874), jiffies_till_next_fqs=1, root ->qsmask 0x0
| [ 462.739092] rcu: rcu_preempt kthread starved for 3211 jiffies! g2405 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
| [ 462.745093] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
| [ 462.750782] rcu: RCU grace-period kthread stack dump:
| [ 462.755426] task:rcu_preempt state:R running task stack: 0 pid: 10 ppid: 2 flags:0x00000428
| [ 462.764703] Call trace:
| [ 462.768560] __switch_to+0x148/0x1f0
| [ 462.772700] __schedule+0x2dc/0x9a8
| [ 462.776802] preempt_schedule_common+0x4c/0x1a8
| [ 462.781214] preempt_schedule+0x38/0x40
| [ 462.785430] _raw_spin_unlock_irqrestore+0x90/0xa0
| [ 462.789928] finish_swait+0x64/0x88
| [ 462.794055] rcu_gp_kthread+0x448/0x1298
| [ 462.798276] kthread+0x13c/0x188
| [ 462.802302] ret_from_fork+0x10/0x34
| [ 462.806485] rcu: Stack dump where RCU GP kthread last ran:
| [ 462.811098] Task dump for CPU 0:
| [ 462.815033] task:kcompactd0 state:S stack: 0 pid: 26 ppid: 2 flags:0x00000428
| [ 462.822011] Call trace:
| [ 462.825784] dump_backtrace+0x0/0x240
| [ 462.829908] show_stack+0x34/0x88
| [ 462.833915] sched_show_task+0x208/0x230
| [ 462.838140] dump_cpu_task+0x4c/0x5c
| [ 462.842246] rcu_check_gp_kthread_starvation+0x240/0x388
| [ 462.846896] rcu_sched_clock_irq+0x14f4/0x17a0
| [ 462.851284] update_process_times+0x6c/0xb8
| [ 462.855578] tick_sched_handle.isra.0+0x58/0x88
| [ 462.859991] tick_sched_timer+0x68/0xe0
| [ 462.864175] __hrtimer_run_queues+0x288/0x730
| [ 462.868582] hrtimer_interrupt+0x114/0x288
| [ 462.872855] arch_timer_handler_virt+0x50/0x70
| [ 462.877243] handle_percpu_devid_irq+0x104/0x4c0
| [ 462.881653] generic_handle_irq+0x54/0x78
| [ 462.885842] __handle_domain_irq+0xac/0x130
| [ 462.890101] gic_handle_irq+0x70/0x108
| [ 462.894233] el1_irq+0xc0/0x180
| [ 462.898224] _raw_spin_unlock_irq+0x50/0x98
| [ 462.902481] finish_task_switch+0xb4/0x398
| [ 462.906768] __schedule+0x2e0/0x9a8
| [ 462.910852] preempt_schedule_notrace+0x70/0x1c0
| [ 462.915285] ftrace_ops_list_func+0x10c/0x258
| [ 462.919609] ftrace_graph_call+0x0/0x4
| [ 462.923792] kthread_should_stop+0x8/0x70
| [ 462.928038] kcompactd+0x314/0x4b8
| [ 462.932105] kthread+0x13c/0x188
| [ 462.936096] ret_from_fork+0x10/0x34
| [ 462.952198]
| [ 462.954584] ================================
| [ 462.957671] WARNING: inconsistent lock state
| [ 462.960977] 5.10.0-rc4-next-20201119-00004-g77838ee21ff6 #22 Not tainted
| [ 462.964810] --------------------------------
| [ 462.967939] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
| [ 462.971578] kcompactd0/26 [HC0[0]:SC0[0]:HE0:SE1] takes:
| [ 462.975088] ffffcb9ccd9f47d8 (rcu_node_0){?.-.}-{2:2}, at: rcu_sched_clock_irq+0x7c0/0x17a0
| [ 462.983339] {IN-HARDIRQ-W} state was registered at:
| [ 462.986815] __lock_acquire+0xae8/0x1ac8
| [ 462.989968] lock_acquire+0x268/0x508
| [ 462.993015] _raw_spin_lock_irqsave+0x78/0x14c
| [ 462.996267] rcu_sched_clock_irq+0x7c0/0x17a0
| [ 462.999496] update_process_times+0x6c/0xb8
| [ 463.002663] tick_sched_handle.isra.0+0x58/0x88
| [ 463.005926] tick_sched_timer+0x68/0xe0
| [ 463.009016] __hrtimer_run_queues+0x288/0x730
| [ 463.012245] hrtimer_interrupt+0x114/0x288
| [ 463.015401] arch_timer_handler_virt+0x50/0x70
| [ 463.018666] handle_percpu_devid_irq+0x104/0x4c0
| [ 463.021974] generic_handle_irq+0x54/0x78
| [ 463.025112] __handle_domain_irq+0xac/0x130
| [ 463.028279] gic_handle_irq+0x70/0x108
| [ 463.031320] el1_irq+0xc0/0x180
| [ 463.034196] _raw_spin_unlock_irq+0x50/0x98
| [ 463.037365] finish_task_switch+0xb4/0x398
| [ 463.040505] __schedule+0x2e0/0x9a8
| [ 463.043472] preempt_schedule_notrace+0x70/0x1c0
| [ 463.046777] ftrace_ops_list_func+0x10c/0x258
| [ 463.049998] ftrace_graph_call+0x0/0x4
| [ 463.053078] kthread_should_stop+0x8/0x70
| [ 463.056221] kcompactd+0x314/0x4b8
| [ 463.059184] kthread+0x13c/0x188
| [ 463.062074] ret_from_fork+0x10/0x34
| [ 463.065079] irq event stamp: 40276
| [ 463.068061] hardirqs last enabled at (40275): [<ffffcb9ccc826168>] _raw_spin_unlock_irq+0x48/0x98
| [ 463.072691] hardirqs last disabled at (40276): [<ffffcb9ccb4123c0>] el1_irq+0x80/0x180
| [ 463.076954] softirqs last enabled at (40006): [<ffffcb9ccb410b80>] __do_softirq+0x630/0x6b4
| [ 463.081364] softirqs last disabled at (40003): [<ffffcb9ccb4c623c>] irq_exit+0x1cc/0x1e0
| [ 463.085574]
| [ 463.085574] other info that might help us debug this:
| [ 463.089361] Possible unsafe locking scenario:
| [ 463.089361]
| [ 463.092907] CPU0
| [ 463.095454] ----
| [ 463.097991] lock(rcu_node_0);
| [ 463.102493] <Interrupt>
| [ 463.105092] lock(rcu_node_0);
| [ 463.109675]
| [ 463.109675] *** DEADLOCK ***
| [ 463.109675]
| [ 463.113299] 1 lock held by kcompactd0/26:
| [ 463.116335] #0: ffffcb9ccd9f47d8 (rcu_node_0){?.-.}-{2:2}, at: rcu_sched_clock_irq+0x7c0/0x17a0
| [ 463.125315]
| [ 463.125315] stack backtrace:
| [ 463.128700] CPU: 0 PID: 26 Comm: kcompactd0 Not tainted 5.10.0-rc4-next-20201119-00004-g77838ee21ff6 #22
| [ 463.133364] Hardware name: linux,dummy-virt (DT)
| [ 463.136648] Call trace:
| [ 463.139294] dump_backtrace+0x0/0x240
| [ 463.142270] show_stack+0x34/0x88
| [ 463.145148] dump_stack+0x140/0x1bc
| [ 463.148098] print_usage_bug+0x2a0/0x2f0
| [ 463.151167] mark_lock.part.0+0x438/0x4e8
| [ 463.154249] mark_held_locks+0x54/0x90
| [ 463.157269] lockdep_hardirqs_on_prepare+0xe0/0x290
| [ 463.160613] trace_hardirqs_on+0x90/0x370
| [ 463.163699] el1_irq+0xdc/0x180
| [ 463.166559] _raw_spin_unlock_irq+0x50/0x98
| [ 463.169717] finish_task_switch+0xb4/0x398
| [ 463.172841] __schedule+0x2e0/0x9a8
| [ 463.175812] preempt_schedule_notrace+0x70/0x1c0
| [ 463.179109] ftrace_ops_list_func+0x10c/0x258
| [ 463.182317] ftrace_graph_call+0x0/0x4
| [ 463.185338] kthread_should_stop+0x8/0x70
| [ 463.188421] kcompactd+0x314/0x4b8
| [ 463.191349] kthread+0x13c/0x188
| [ 463.194230] ret_from_fork+0x10/0x34
| [ 463.199912] BUG: scheduling while atomic: kcompactd0/26/0x00000002
| [ 463.211543] INFO: lockdep is turned off.
| [ 463.219382] Modules linked in:
| [ 463.231386] Preemption disabled at:
| [ 463.235427] [<ffffcb9ccb6233dc>] ftrace_ops_list_func+0x10c/0x258
| [ 463.250883] CPU: 0 PID: 26 Comm: kcompactd0 Not tainted 5.10.0-rc4-next-20201119-00004-g77838ee21ff6 #22
| [ 463.255542] Hardware name: linux,dummy-virt (DT)
| [ 463.258751] Call trace:
| [ 463.261423] dump_backtrace+0x0/0x240
| [ 463.264426] show_stack+0x34/0x88
| [ 463.267350] dump_stack+0x140/0x1bc
| [ 463.270308] __schedule_bug+0xcc/0xe0
| [ 463.273317] __schedule+0x888/0x9a8
| [ 463.276295] preempt_schedule_notrace+0x70/0x1c0
| [ 463.279608] ftrace_ops_list_func+0x10c/0x258
| [ 463.282809] ftrace_graph_call+0x0/0x4
| [ 463.285835] kthread_should_stop+0x8/0x70
| [ 463.288933] kcompactd+0x314/0x4b8
| [ 463.291871] kthread+0x13c/0x188
| [ 463.294746] ret_from_fork+0x10/0x34
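
To decode the "inconsistent lock state" splat above: the {IN-HARDIRQ-W} ->
{HARDIRQ-ON-W} complaint corresponds to the classic pattern below. This is a
generic illustration only (the demo_* names are made up), not the actual
rcu_node code:

	#include <linux/interrupt.h>
	#include <linux/spinlock.h>

	static DEFINE_SPINLOCK(demo_lock);

	/* The lock is taken from hardirq context ... */
	static irqreturn_t demo_irq_handler(int irq, void *dev_id)
	{
		spin_lock(&demo_lock);		/* lockdep: IN-HARDIRQ-W */
		spin_unlock(&demo_lock);
		return IRQ_HANDLED;
	}

	/* ... so acquiring it with interrupts enabled can self-deadlock. */
	static void demo_task_context(void)
	{
		spin_lock(&demo_lock);		/* lockdep: HARDIRQ-ON-W */
		/*
		 * An interrupt arriving here re-enters demo_irq_handler(),
		 * which then spins on demo_lock forever. Using
		 * spin_lock_irqsave()/spin_unlock_irqrestore() here would
		 * avoid that.
		 */
		spin_unlock(&demo_lock);
	}

In the report above it looks like hardirqs were re-enabled (on the el1_irq
return path, under the ftrace list func) while rcu_node_0 was still held,
which is what trips the check.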
In case you're interested in reproducing, the qemu cmdline I use here is
(config attached, disk image not required):
$ qemu-system-aarch64 -kernel $KERNEL_WORKTREE/arch/arm64/boot/Image \
	-append "console=ttyAMA0 root=/dev/sda debug earlycon earlyprintk=serial workqueue.watchdog_thresh=10" \
	-nographic -smp 1 -machine virt -cpu cortex-a57 -m 2G
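
(FWIW, the Image itself is built the usual way; roughly the below, assuming
an aarch64 cross toolchain -- the toolchain prefix and paths are only
examples:

	$ cp .config $KERNEL_WORKTREE/.config
	$ make -C $KERNEL_WORKTREE ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- \
		olddefconfig Image -j$(nproc)
)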
Thanks,
-- Marco
[Attachment: ".config", text/plain, 231865 bytes]