linux-kernel - Re: [PATCH v16 2/8] cpu/hotplug: Reset task stack state in _cpu

Message-ID: <ZBrhKERLxvklAhiP@FVFF77S0Q05N>
Date:   Wed, 22 Mar 2023 11:06:16 +0000
From:   Mark Rutland <mark.rutland@....com>
To:     Usama Arif <usama.arif@...edance.com>
Cc:     dwmw2@...radead.org, tglx@...utronix.de, kim.phillips@....com,
        brgerst@...il.com, piotrgorski@...hyos.org,
        oleksandr@...alenko.name, arjan@...ux.intel.com, mingo@...hat.com,
        bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com,
        x86@...nel.org, pbonzini@...hat.com, paulmck@...nel.org,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        rcu@...r.kernel.org, mimoja@...oja.de, hewenliang4@...wei.com,
        thomas.lendacky@....com, seanjc@...gle.com, pmenzel@...gen.mpg.de,
        fam.zheng@...edance.com, punit.agrawal@...edance.com,
        simon.evans@...edance.com, liangma@...ngbit.com,
        gpiccoli@...lia.com, David Woodhouse <dwmw@...zon.co.uk>
Subject: Re: [PATCH v16 2/8] cpu/hotplug: Reset task stack state in _cpu_up()

On Tue, Mar 21, 2023 at 07:40:02PM +0000, Usama Arif wrote:
> From: David Woodhouse <dwmw@...zon.co.uk>
> 
> Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
> ensured that the shadow call stack and KASAN poisoning were removed from
> a CPU's stack each time that CPU is brought up, not just once.
> 
> This is not incorrect. However, with parallel bringup, an architecture
> may obtain the idle thread for a new CPU from a pre-bringup stage, by
> calling idle_thread_get() for itself. This would mean that the cleanup
> in bringup_cpu() would be too late.
> 
> Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
> which already ensures that the new CPU's stack is available, purely to
> allow for early failure. This occurs when the CPU to be brought up is
> in the CPUHP_OFFLINE state, which should correctly do the cleanup any
> time the CPU has been taken down to the point where such is needed.
> 
> Signed-off-by: David Woodhouse <dwmw@...zon.co.uk>

This all sounds fine to me, and the patch itself looks good.

I built an arm64 kernel with the first three patches from this series applied
atop v6.3-rc3, with defconfig + CONFIG_SHADOW_CALL_STACK=y +
CONFIG_KASAN_INLINE=y + CONFIG_KASAN_STACK=y. I then hotplugged a cpu with:

  while true; do
    echo 0 > /sys/devices/system/cpu/cpu1/online;
    echo 1 > /sys/devices/system/cpu/cpu1/online;
  done

... and that was perfectly happy to run for minutes with no unexpected failures.

To make sure I wasn't avoiding issues by chance, I also tried with each of
scs_task_reset() and kasan_unpoison_task_stack() commented out. With
scs_task_reset() commented out, cpu re-onlining fails after a few iterations,
and with kasan_unpoison_task_stack() commented out I get a KASAN splat upon the
first re-onlining. So that all looks good.
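
For anyone wanting to repeat this, a bounded variant of the loop above that
stops at the first failed write could look like the sketch below. The function
name hotplug_cycle and the default of 100 cycles are illustrative assumptions,
not part of the testing described above; only the sysfs online file is taken
from it.

```shell
#!/bin/sh
# Hypothetical helper (not from the original testing): toggle a CPU's sysfs
# "online" file a fixed number of times and stop at the first failed write.
#   $1: path to the online file, e.g. /sys/devices/system/cpu/cpu1/online
#   $2: number of offline/online cycles (illustrative default: 100)
hotplug_cycle() {
    online_file=$1
    count=${2:-100}
    i=0
    while [ "$i" -lt "$count" ]; do
        # Take the CPU offline; a failed write ends the run.
        if ! echo 0 > "$online_file"; then
            echo "offline failed at iteration $i" >&2
            return 1
        fi
        # Bring the CPU back online; this is the step the patch affects.
        if ! echo 1 > "$online_file"; then
            echo "online failed at iteration $i" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $count cycles"
}
```

Run as root, e.g. `hotplug_cycle /sys/devices/system/cpu/cpu1/online 1000`.
This only catches a write that fails outright; a KASAN splat still has to be
spotted in dmesg.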

FWIW, for the first three patches:

Reviewed-by: Mark Rutland <mark.rutland@....com>
Tested-by: Mark Rutland <mark.rutland@....com> [arm64]

Mark.

> ---
>  kernel/cpu.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 6c0a92ca6bb5..43e0a77f21e8 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -591,12 +591,6 @@ static int bringup_cpu(unsigned int cpu)
>  	struct task_struct *idle = idle_thread_get(cpu);
>  	int ret;
>  
> -	/*
> -	 * Reset stale stack state from the last time this CPU was online.
> -	 */
> -	scs_task_reset(idle);
> -	kasan_unpoison_task_stack(idle);
> -
>  	/*
>  	 * Some architectures have to walk the irq descriptors to
>  	 * setup the vector space for the cpu which comes online.
> @@ -1383,6 +1377,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
>  			ret = PTR_ERR(idle);
>  			goto out;
>  		}
> +
> +		/*
> +		 * Reset stale stack state from the last time this CPU was online.
> +		 */
> +		scs_task_reset(idle);
> +		kasan_unpoison_task_stack(idle);
>  	}
>  
>  	cpuhp_tasks_frozen = tasks_frozen;
> -- 
> 2.25.1
>