Message-ID: <7359f994-8aaf-3cea-f5cf-c0d3929689d6@quicinc.com>
Date:   Tue, 18 Jan 2022 17:16:39 +0530
From:   Mukesh Ojha <quic_mojha@...cinc.com>
To:     lkml <linux-kernel@...r.kernel.org>
CC:     <paulmck@...nel.org>, Thomas Gleixner <tglx@...utronix.de>
Subject: synchronize_rcu_expedited gets stuck in hotplug path

Hi,

We are facing an issue in a hotplug test where cpuhp/2 gets stuck in
synchronize_rcu_expedited() in the path below [1], at state
CPUHP_AP_ONLINE_DYN, and is not able to proceed. We see that
wait_rcu_exp_gp() is queued on cpu2, and it looks like it never got a
chance to run, as it is still pending on cpu2 [2].
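
In case a concrete example helps, here is a minimal, untested sketch of
the pattern that hits this (demo_cpu_online, demo_init and "demo:online"
are hypothetical names, not our actual code; with rcupdate.rcu_expedited
set, synchronize_rcu() takes the expedited path seen in [1]):

#include <linux/module.h>
#include <linux/cpuhotplug.h>
#include <linux/rcupdate.h>

/* Hypothetical CPUHP_AP_ONLINE_DYN callback; runs in cpuhp/<cpu> on
 * the CPU being brought online. */
static int demo_cpu_online(unsigned int cpu)
{
	/*
	 * The expedited grace period queues wait_rcu_exp_gp() as a work
	 * item; if it lands in this CPU's worker pool before the pool's
	 * workers are schedulable, we block here forever.
	 */
	synchronize_rcu();
	return 0;
}

static int __init demo_init(void)
{
	int ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "demo:online",
				    demo_cpu_online, NULL);

	/* CPUHP_AP_ONLINE_DYN returns the allocated state on success. */
	return ret < 0 ? ret : 0;
}
module_init(demo_init);
MODULE_LICENSE("GPL");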

So, when exactly does cpu2 become available for scheduling in the
hotplug path? Is it only after CPUHP_AP_ACTIVE?

This looks like a deadlock. Can it be fixed by queuing
wait_rcu_exp_gp() on another workqueue, or is calling synchronize_rcu()
in the hotplug path simply wrong usage?
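
To illustrate what we mean by "another wq" -- an untested sketch in
terms of the generic workqueue API (demo_fn and demo_work are
hypothetical stand-ins for wait_rcu_exp_gp() and its work item):

#include <linux/workqueue.h>

static void demo_fn(struct work_struct *work)
{
	/* stand-in for wait_rcu_exp_gp() */
}
static DECLARE_WORK(demo_work, demo_fn);

static void demo_queue(void)
{
	/*
	 * queue_work() on a bound workqueue places the item in the
	 * local CPU's pool -- here, the half-online CPU whose workers
	 * cannot yet run.  An unbound workqueue lets any CPU pick the
	 * work up, which is the kind of change we are asking about:
	 */
	queue_work(system_unbound_wq, &demo_work);
}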

[1]

=======================================================
Process: cpuhp/2, [affinity: 0x4] cpu: 2 pid: 24 start: 0xffffff87803e4a00
=======================================================
     Task name: cpuhp/2 [affinity: 0x4] pid: 24 cpu: 2 prio: 120 start: ffffff87803e4a00
     state: 0x2[D] exit_state: 0x0 stack base: 0xffffffc010160000
     Last_enqueued_ts:      59.022215498 Last_sleep_ts: 59.022922946
     Stack:
     [<ffffffe9f4074354>] __switch_to+0x248
     [<ffffffe9f5c02474>] __schedule+0x5b0
     [<ffffffe9f5c02b28>] schedule+0x80
     [<ffffffe9f42321a4>] synchronize_rcu_expedited+0x1c4
     [<ffffffe9f423b294>] synchronize_rcu+0x4c
     [<ffffffe9f6d04ab0>] waltgov_stop[sched_walt]+0x78
     [<ffffffe9f512fa28>] cpufreq_add_policy_cpu+0xc0
     [<ffffffe9f512e48c>] cpufreq_online[jt]+0x10f4
     [<ffffffe9f51323b8>] cpuhp_cpufreq_online+0x14
     [<ffffffe9f4128d3c>] cpuhp_invoke_callback+0x2f8
     [<ffffffe9f412c30c>] cpuhp_thread_fun+0x130
     [<ffffffe9f4187a58>] smpboot_thread_fn+0x180
     [<ffffffe9f417d98c>] kthread+0x150
     [<ffffffe9f4013918>] ret_to_user[jt]+0x0


[2]

CPU 2
pool 0
IDLE Workqueue worker: kworker/2:3 current_work: (None)
IDLE Workqueue worker: kworker/2:2 current_work: (None)
IDLE Workqueue worker: kworker/2:1 current_work: (None)
IDLE Workqueue worker: kworker/2:0 current_work: (None)
Pending entry: wait_rcu_exp_gp[jt]
Pending entry: lru_add_drain_per_cpu[jt]
Pending entry: wq_barrier_func[jt]

Thanks,
Mukesh
