Message-ID: <8a266076-b3dc-4a39-aac4-089e2ef77da3@gmx.de>
Date: Thu, 1 Feb 2024 17:41:10 +0100
From: Helge Deller <deller@....de>
To: Tejun Heo <tj@...nel.org>, Helge Deller <deller@...nel.org>
Cc: Lai Jiangshan <jiangshanlai@...il.com>, linux-kernel@...r.kernel.org,
linux-parisc@...r.kernel.org
Subject: Re: [PATCH][RFC] workqueue: Fix kernel panic on CPU hot-unplug

On 1/31/24 23:28, Tejun Heo wrote:
> On Wed, Jan 31, 2024 at 08:27:45PM +0100, Helge Deller wrote:
>> When hot-unplugging a 32-bit CPU on the parisc platform with
>> "chcpu -d 1", I get the following kernel panic. Adding a check
>> for !pwq prevents the panic.
>>
>> Kernel Fault: Code=26 (Data memory access rights trap) at addr 00000000
>> CPU: 1 PID: 21 Comm: cpuhp/1 Not tainted 6.8.0-rc1-32bit+ #1291
>> Hardware name: 9000/778/B160L
>>
>> IASQ: 00000000 00000000 IAOQ: 10446db4 10446db8
>> IIR: 0f80109c ISR: 00000000 IOR: 00000000
>> CPU: 1 CR30: 11dd1710 CR31: 00000000
>> IAOQ[0]: wq_update_pod+0x98/0x14c
>> IAOQ[1]: wq_update_pod+0x9c/0x14c
>> RP(r2): wq_update_pod+0x80/0x14c
>> Backtrace:
>> [<10448744>] workqueue_offline_cpu+0x1d4/0x1dc
>> [<10429db4>] cpuhp_invoke_callback+0xf8/0x200
>> [<1042a1d0>] cpuhp_thread_fun+0xb8/0x164
>> [<10452970>] smpboot_thread_fn+0x284/0x288
>> [<1044d8f4>] kthread+0x12c/0x13c
>> [<1040201c>] ret_from_kernel_thread+0x1c/0x24
>> Kernel panic - not syncing: Kernel Fault
>>
>> Signed-off-by: Helge Deller <deller@....de>
>>
>> ---
>>
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index 76e60faed892..dfeee7b7322c 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -4521,6 +4521,8 @@ static void wq_update_pod(struct workqueue_struct *wq, int cpu,
>> wq_calc_pod_cpumask(target_attrs, cpu, off_cpu);
>> pwq = rcu_dereference_protected(*per_cpu_ptr(wq->cpu_pwq, cpu),
>> lockdep_is_held(&wq_pool_mutex));
>> + if (!pwq)
>> + return;
>
> Hmm... I have a hard time imagining a scenario where some CPUs don't have
> pwq installed on wq->cpu_pwq. Can you please run `drgn
> tools/workqueue/wq_dump.py` before triggering the hotplug event and paste
> the output along with full dmesg?
I'm not sure if parisc is already fully supported with that tool, or
if I'm doing something wrong:

root@...ian:~# uname -a
Linux debian 6.8.0-rc1-32bit+ #1292 SMP PREEMPT Thu Feb 1 11:31:38 CET 2024 parisc GNU/Linux
root@...ian:~# drgn --main-symbols -s ./vmlinux ./wq_dump.py
Traceback (most recent call last):
  File "/usr/bin/drgn", line 33, in <module>
    sys.exit(load_entry_point('drgn==0.0.25', 'console_scripts', 'drgn')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/drgn/cli.py", line 301, in _main
    runpy.run_path(script, init_globals={"prog": prog}, run_name="__main__")
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "./wq_dump.py", line 78, in <module>
    worker_pool_idr = prog['worker_pool_idr']
                      ~~~~^^^^^^^^^^^^^^^^^^^
KeyError: 'worker_pool_idr'

Do you have an idea what is going wrong here? I'll check further, but
otherwise it is probably easier for me to add some printk() calls to
wq_update_pod() and send you that output.

Helge
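
The KeyError in the traceback above comes from the prog['worker_pool_idr']
lookup in wq_dump.py. A minimal sketch for checking which symbols a
script needs are missing before running it is shown below. The helper
name find_missing_symbols is hypothetical, the symbol list is only an
assumption about what the script looks up, and a plain dict stands in
for the drgn program object so that the snippet runs without drgn.

```python
def find_missing_symbols(prog, names):
    """Return the names whose lookup in prog raises KeyError.

    prog is anything that supports prog[name]; the traceback above
    shows that a failed lookup on the drgn program raises KeyError.
    """
    missing = []
    for name in names:
        try:
            prog[name]
        except KeyError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Stand-in for the drgn program object (assumption for illustration):
    # it knows "workqueues" but not "worker_pool_idr".
    fake_prog = {"workqueues": object()}
    print(find_missing_symbols(fake_prog, ["worker_pool_idr", "workqueues"]))
```

Inside a drgn session the same helper could be called with the real
prog object. An empty result would mean all listed symbols were found.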