Date:   Tue, 13 Apr 2021 12:58:00 +0100
From:   Qais Yousef <qais.yousef@....com>
To:     Dongli Zhang <dongli.zhang@...cle.com>
Cc:     linux-kernel@...r.kernel.org, tglx@...utronix.de,
        peterz@...radead.org, mpe@...erman.id.au, paulmck@...nel.org,
        npiggin@...il.com, frederic@...nel.org, ethp@...com,
        joe.jin@...cle.com
Subject: Re: [PATCH v2 1/1] kernel/cpu: to track which CPUHP callback is
 failed

On 04/08/21 22:53, Dongli Zhang wrote:
> During bootup or cpu hotplug, cpuhp_up_callbacks() or
> cpuhp_down_callbacks() call many CPUHP callbacks (e.g., perf, mm,
> workqueue, RCU, kvmclock and more) for each cpu to online/offline. It may
> roll back to its previous state if any of the callbacks fails. As a result,
> the user cannot tell which callback failed, and usually the
> only symptom is a cpu online/offline failure.
> 
> This patch prints more debug logs to help the user narrow down the
> root cause.
> 
> Below is an example of how the patch helps narrow down the root cause
> of the issue fixed by commit d7eb79c6290c ("KVM: kvmclock: Fix vCPUs > 64
> can't be online/hotpluged").
> 
> We get the dynamic debug log below once we add
> dyndbg="file kernel/cpu.c +p" to the kernel command line and the issue is
> reproduced.

You can also enable it at runtime (with debugfs mounted):

echo "file kernel/cpu.c +p" > /sys/kernel/debug/dynamic_debug/control

> 
> "CPUHP up callback failure (-12) for cpu 64 at kvmclock:setup_percpu (66)"
> 
> Cc: Joe Jin <joe.jin@...cle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@...cle.com>
> ---

I don't see the harm in adding the debug print if some find it useful.

FWIW

Reviewed-by: Qais Yousef <qais.yousef@....com>

Cheers

--
Qais Yousef

> Changed since v1 RFC:
>   - use pr_debug() instead of pr_err_once() (suggested by Qais Yousef)
>   - print log for cpuhp_down_callbacks() as well (suggested by Qais Yousef)
> 
>  kernel/cpu.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 1b6302ecbabe..bcd4dd7de9c3 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -621,6 +621,10 @@ static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
>  		st->state++;
>  		ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
>  		if (ret) {
> +			pr_debug("CPUHP up callback failure (%d) for cpu %u at %s (%d)\n",
> +				 ret, cpu, cpuhp_get_step(st->state)->name,
> +				 st->state);
> +
>  			if (can_rollback_cpu(st)) {
>  				st->target = prev_state;
>  				undo_cpu_up(cpu, st);
> @@ -990,6 +994,10 @@ static int cpuhp_down_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
>  	for (; st->state > target; st->state--) {
>  		ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
>  		if (ret) {
> +			pr_debug("CPUHP down callback failure (%d) for cpu %u at %s (%d)\n",
> +				 ret, cpu, cpuhp_get_step(st->state)->name,
> +				 st->state);
> +
>  			st->target = prev_state;
>  			if (st->state < prev_state)
>  				undo_cpu_down(cpu, st);
> -- 
> 2.17.1
> 
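
For anyone following along without the kernel source at hand, here is a
minimal userspace sketch of the pattern the up-path hunk instruments: walk
the steps in order, and on the first failure report which step failed before
undoing the ones that already completed. This is an illustration only, not
the kernel code. The step table, the step names, the helpers ok() and
fail_enomem(), and bring_up() itself are all hypothetical, and fprintf()
stands in for pr_debug().

```c
#include <stdio.h>

/* Hypothetical stand-in for a hotplug step; not the kernel's struct. */
struct step {
	const char *name;
	int (*startup)(unsigned int cpu);
	int (*teardown)(unsigned int cpu);
};

static int ok(unsigned int cpu) { (void)cpu; return 0; }
static int fail_enomem(unsigned int cpu) { (void)cpu; return -12; } /* -ENOMEM */

/* Illustrative step table; the third step always fails. */
static const struct step steps[] = {
	{ "step-a", ok,          ok },
	{ "step-b", ok,          ok },
	{ "step-c", fail_enomem, ok },
	{ "step-d", ok,          ok },
};

/*
 * *state is the index of the last completed step (-1 means none).
 * Walk up to target; on failure, log the failing step and undo the
 * completed steps back down to the starting state.
 */
static int bring_up(unsigned int cpu, int *state, int target)
{
	int prev = *state;
	int ret = 0;

	while (*state < target) {
		(*state)++;
		ret = steps[*state].startup(cpu);
		if (ret) {
			fprintf(stderr,
				"up callback failure (%d) for cpu %u at %s (%d)\n",
				ret, cpu, steps[*state].name, *state);
			/* The failed step never completed: start undoing below it. */
			for ((*state)--; *state > prev; (*state)--)
				steps[*state].teardown(cpu);
			break;
		}
	}
	return ret;
}
```

Calling bring_up(64, &state, 3) with state starting at -1 stops at "step-c",
prints one line naming that step, and leaves state back at -1. Without the
print, the caller would only see the -12 return value, which is the situation
the patch description is complaining about.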
