lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <atuz6en52e6jma7dz322dtgcs7zv4qlybzflb35eu5irvwbl6q@m5jwldrhxcwn>
Date: Sat, 18 Jan 2025 16:40:21 +0900
From: Koichiro Den <koichiro.den@...onical.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org, peterz@...radead.org
Subject: Re: [PATCH] cpu/hotplug: disallow writing any state in atomic AP
 section to sysfs target

On Thu, Jan 16, 2025 at 02:21:14PM GMT, Thomas Gleixner wrote:
> On Fri, Dec 20 2024 at 23:15, Koichiro Den wrote:
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 34f1a09349fc..c877443f5888 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -2127,6 +2127,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
> >  	[CPUHP_BP_KICK_AP] = {
> >  		.name			= "cpu:kick_ap",
> >  		.startup.single		= cpuhp_kick_ap_alive,
> > +		.cant_stop		= true,
> 
> Why? If it stops here, then no harm is done. The AP just waits for being
> released. It won't change the state as that's a seperate handshake
> mechanism.

Thanks for the feedback.
I see my commit message was poorly worded. Please see below.

> 
> >  	},
> >  
> >  	/*
> > @@ -2192,6 +2193,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
> >  	 * state for synchronsization */
> >  	[CPUHP_AP_ONLINE] = {
> >  		.name			= "ap:online",
> > +		.cant_stop		= true,
> 
> Your change log is pretty unclear about the reason for this change.

Looking back, I agree it's unclear.

The reason why I set cant_stop for CPUHP_BP_KICK_AP is to prevent state
transitions, such as CPUHP_ONLINE -> CPUHP_BP_KICK_AP -> CPUHP_ONLINE, from
being stuck on its return trip. Kicking the AP needs to happen as idle task
is playing dead in this scenario.

I think it's inherently and unavoidably asymmetric because the atomic AP
states handle leading the idle task to dead, while kicking the AP occurs so
outside of the atomic AP states. It appears to me that just prohibiting
stops at CPUHP_BP_KICK_AP is the simplest solution.

My change log should have been something like this:

(before)
  [...]
  states to sysfs target. Additionally, set cant_stop to true for both
  CPUHP_BP_KICK_AP (when CONFIG_HOTPLUG_SPLIT_STARTUP=y) and
  CPUHP_AP_ONLINE since we do not automatically make the state machine
  proceed to the other end of the atomic states.

(after)
  [...]
  states to sysfs target. Additionally, set cant_stop to true for both
  CPUHP_BP_KICK_AP (when CONFIG_HOTPLUG_SPLIT_STARTUP=y) and
  CPUHP_AP_ONLINE due to the following reasons:
  * For CPUHP_BP_KICK_AP: To ensure transitions from a higher state can
    successfully return to a higher state, kicking the AP must take place.
  * For CPUHP_AP_ONLINE: we do not automatically make the state machine
    proceed to the other end of the atomic states.

Thanks,

-Koichiro Den

> 
> Thanks,
> 
>         tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ