lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 8 Oct 2017 16:35:40 +0100
From:   Mark Rutland <mark.rutland@....com>
To:     Leo Yan <leo.yan@...aro.org>
Cc:     Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will.deacon@....com>,
        James Morse <james.morse@....com>,
        AKASHI Takahiro <takahiro.akashi@...aro.org>,
        Hoeun Ryu <hoeun.ryu@...il.com>,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Mathieu Poirier <mathieu.poirier@...aro.org>
Subject: Re: [PATCH] arm64: kdump: Avoid to power off nonpanic CPUs

Hi Leo,

On Sun, Oct 08, 2017 at 10:12:46PM +0800, Leo Yan wrote:
> commit a88ce63b642c ("arm64: kexec: have own crash_smp_send_stop() for
> crash dump for nonpanic cores") introduces ARM64 architecture function
> crash_smp_send_stop() to replace the weak function, this results in
> the nonpanic CPUs to be hot-plugged out and CPUs are placed into low
> power state on ARM64 platforms with the flow:
> 
>   Panic CPU:
>     machine_crash_shutdown()
>       crash_smp_send_stop()
> 	smp_cross_call(&mask, IPI_CPU_CRASH_STOP)
> 
>   Nonpanic CPUs:
>     handle_IPI()
>       ipi_cpu_crash_stop()
>         cpu_ops[cpu]->cpu_die()
> 
> The upper patch has no issue if enabled crash dump only; but if enabled
> crash dump and Coresight debug module for panic dumping at the meantime,
> nonpanic CPUs are powered off in crash dump flow, 

We want to turn secondary CPUs off if at all possible, since we want to prevent
issues resulting from asynchronous behaviour (e.g. TLB/cache fetches) that
could result in subsequent problems (e.g. if bad page tables resulted in page
table walks to MMIO devices).

So we *really* want this behaviour in the general case.

> later this may introduce conflicts with the Coresight debug module because
> Coresight debug registers dumping requires the CPU must be powered on for
> some platforms (e.g. Hi6220 on Hikey board). If we cannot keep the CPUs
> powered on, we can see the hardware lockup issue when access Coresight debug
> registers.

Just to check I understand, the coresight debug module is being invoked as a
panic notifier in the current kernel, right?

> To fix this issue, this commit removes CPU hotplug operation in func
> crash_smp_send_stop() and let CPUs to run into WFE/WFI states so CPUs
> can still be powered on after crash dump. This finally is more safe
> for Coresight debug module to dump registers and avoid hardware lockup.
> 
> Cc: James Morse <james.morse@....com>
> Cc: Will Deacon <will.deacon@....com>
> Cc: Catalin Marinas <catalin.marinas@....com>
> Cc: Mathieu Poirier <mathieu.poirier@...aro.org>
> Signed-off-by: Leo Yan <leo.yan@...aro.org>
> ---
>  arch/arm64/kernel/smp.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 9f7195a..a65e68b 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -856,12 +856,6 @@ static void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
>  
>  	local_irq_disable();
>  
> -#ifdef CONFIG_HOTPLUG_CPU
> -	if (cpu_ops[cpu]->cpu_die)
> -		cpu_ops[cpu]->cpu_die(cpu);
> -#endif
> -

If it's really necessary to keep secondary CPUs online, please limit that to
the case where the coresight debug module is being used.

IIRC there were similar interactions with cpuidle, and I don't see why hotplug
should be any different.

Thanks,
Mark.

Powered by blists - more mailing lists