linux-kernel - Re: [PATCH] cpu/hotplug: ensure the starting section runs fully regardless of target

Message-ID: <skljhhmj7q4hiito277wbbw5u7ygpebfhhx7rkt5k7qtbf6qby@w2vnygftea77>
Date: Mon, 9 Dec 2024 19:46:40 +0900
From: Koichiro Den <koichiro.den@...onical.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org, peterz@...radead.org
Subject: Re: [PATCH] cpu/hotplug: ensure the starting section runs fully
 regardless of target

On Mon, Dec 09, 2024 at 01:19:43PM +0900, Koichiro Den wrote:
> On Mon, Dec 09, 2024 at 09:12:44AM +0900, Koichiro Den wrote:
> > On Sun, Dec 08, 2024 at 09:34:37PM +0100, Thomas Gleixner wrote:
> > > On Sat, Dec 07 2024 at 23:47, Koichiro Den wrote:
> > > >  static int take_cpu_down(void *_param)
> > > >  {
> > > >  	struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
> > > > -	enum cpuhp_state target = max((int)st->target, CPUHP_AP_OFFLINE);
> > > >  	int err, cpu = smp_processor_id();
> > > >  
> > > >  	/* Ensure this CPU doesn't handle any more interrupts. */
> > > > @@ -1285,8 +1284,9 @@ static int take_cpu_down(void *_param)
> > > >  
> > > >  	/*
> > > >  	 * Invoke the former CPU_DYING callbacks. DYING must not fail!
> > > > +	 * Regardless of st->target, it must run through to CPUHP_AP_OFFLINE.
> > > >  	 */
> > > > -	cpuhp_invoke_callback_range_nofail(false, cpu, st, target);
> > > > +	cpuhp_invoke_callback_range_nofail(false, cpu, st, CPUHP_AP_OFFLINE);
> > > 
> > > This is really the wrong place. This wants to be enforced at the sysfs
> > > interface already and reject writes which are between AP_OFFLINE and
> > > AP_ONLINE.
> > >   
> > > It's utterly confusing to write a particular target and then magically
> > > end up at some other state.
> > 
> > Ok, I'll send v2. Thanks for the review.
> 
> Now I'm wondering whether we should go further.
> The PREPARE section is not fully symmetric between bring-up and teardown,
> which causes problems such as the example below:
> 
>     E.g.
> 
>     (1) writing <some state in between> into 'target' and then (2) bringing
>     the CPU fully online again by writing a large value leaves
>     hrtimer_cpu_base's 'online' field at 0, because hrtimers:prepare does not
>     re-run.
> 
>            - hrtimers:prepare (CPUHP_HRTIMERS_PREPARE)
>            - ...
>     ------ - <some state in between>
>      ^  :  - ...
>      :  :  - hrtimers:dying (CPUHP_AP_HRTIMERS_DYING)
>     (1)(2)
>      :  :
>      :  v
> 
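To make the sequence in the diagram concrete, here is a toy model in plain C. It is only a sketch under stated assumptions: the five-state ladder, struct toy_cpu and every toy_* helper are hypothetical stand-ins, not kernel code. The only behaviour it mirrors is the one described above, namely that hrtimers:prepare sets 'online' on the way up and hrtimers:dying clears it on the way down.

```c
/* Toy state ladder (assumption): only the relative order matters. */
enum {
	ST_OFFLINE = 0,
	ST_HRTIMERS_PREPARE,	/* stands in for CPUHP_HRTIMERS_PREPARE */
	ST_MID,			/* <some state in between> */
	ST_HRTIMERS_DYING,	/* stands in for CPUHP_AP_HRTIMERS_DYING */
	ST_ONLINE,
};

struct toy_cpu {
	int state;
	int hrtimer_online;	/* models hrtimer_cpu_base's 'online' field */
};

/* Bring-up side: only hrtimers:prepare has a callback in this model. */
static void toy_startup(struct toy_cpu *c, int state)
{
	if (state == ST_HRTIMERS_PREPARE)
		c->hrtimer_online = 1;
}

/* Teardown side: only hrtimers:dying has a callback in this model. */
static void toy_teardown(struct toy_cpu *c, int state)
{
	if (state == ST_HRTIMERS_DYING)
		c->hrtimer_online = 0;
}

/* Walk one state at a time towards 'target', like a write to 'target'. */
static void toy_set_target(struct toy_cpu *c, int target)
{
	while (c->state < target) {
		c->state++;
		toy_startup(c, c->state);
	}
	while (c->state > target) {
		toy_teardown(c, c->state);
		c->state--;
	}
}

/* Boot fully, step (1): go down to 'halfway', step (2): go fully up again. */
static int toy_replay(int halfway)
{
	struct toy_cpu c = { ST_OFFLINE, 0 };

	toy_set_target(&c, ST_ONLINE);
	toy_set_target(&c, halfway);
	toy_set_target(&c, ST_ONLINE);
	return c.hrtimer_online;
}
```

In this model, toy_replay(ST_MID) leaves hrtimer_online at 0 because the walk back up never passes ST_HRTIMERS_PREPARE, while a full round trip through ST_OFFLINE restores it to 1.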
> While I understand this is still a debug option, it seems to me that there are
> several approaches to consider here. I'm inclined toward (a).
> 
> (a). prohibit writing all halfway states in PREPARE+STARTING sections, i.e.
> 
>      --- a/kernel/cpu.c
>      +++ b/kernel/cpu.c
>      @@ -2759,7 +2759,8 @@ static ssize_t target_store(struct device *dev,
>                      return ret;
>      
>       #ifdef CONFIG_CPU_HOTPLUG_STATE_CONTROL
>      -       if (target < CPUHP_OFFLINE || target > CPUHP_ONLINE)
>      +       if (target != CPUHP_OFFLINE || target > CPUHP_ONLINE ||
>      +           target < CPUHP_AP_OFFLINE_IDLE)
>                      return -EINVAL;
>       #else
>              if (target != CPUHP_OFFLINE && target != CPUHP_ONLINE)

Oops, sorry, this was my mistake. It should be:

       --- a/kernel/cpu.c
       +++ b/kernel/cpu.c
       @@ -2759,7 +2759,8 @@ static ssize_t target_store(struct device *dev
                       return ret;
       
        #ifdef CONFIG_CPU_HOTPLUG_STATE_CONTROL
       -       if (target < CPUHP_OFFLINE || target > CPUHP_ONLINE)
       +       if (target < CPUHP_OFFLINE || target > CPUHP_ONLINE ||
       +           (target > CPUHP_OFFLINE && target < CPUHP_AP_OFFLINE_IDLE))
                       return -EINVAL;
        #else
               if (target != CPUHP_OFFLINE && target != CPUHP_ONLINE)

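As a sanity check on the three conditions, here is a standalone sketch in plain C. The numeric values are made up and only their relative order is meant to matter; the real enum cpuhp_state lives in include/linux/cpuhotplug.h. The position of the CPUHP_AP_OFFLINE_IDLE bound (above the PREPARE and STARTING sections) is an assumption taken from the intent of (a), and the target_rejected_* helpers are hypothetical names.

```c
#include <stdbool.h>

/* Stand-in state numbers (assumption): only the ordering is relevant. */
enum sketch_state {
	SK_OFFLINE         = 0,		/* CPUHP_OFFLINE */
	SK_PREPARE_STATE   = 10,	/* some PREPARE-section state */
	SK_AP_OFFLINE      = 50,	/* CPUHP_AP_OFFLINE */
	SK_STARTING_STATE  = 60,	/* some STARTING-section state */
	SK_AP_OFFLINE_IDLE = 100,	/* bound named in the diff; position assumed */
	SK_ONLINE          = 200,	/* CPUHP_ONLINE */
};

/* Existing check: any value within [OFFLINE, ONLINE] is accepted. */
static bool target_rejected_old(int target)
{
	return target < SK_OFFLINE || target > SK_ONLINE;
}

/* First attempt: the '!=' term alone already rejects every non-OFFLINE value. */
static bool target_rejected_first_try(int target)
{
	return target != SK_OFFLINE || target > SK_ONLINE ||
	       target < SK_AP_OFFLINE_IDLE;
}

/* Corrected check: OFFLINE stays valid, halfway states below the bound do not. */
static bool target_rejected_fixed(int target)
{
	return target < SK_OFFLINE || target > SK_ONLINE ||
	       (target > SK_OFFLINE && target < SK_AP_OFFLINE_IDLE);
}
```

Under this assumed ordering, the first attempt rejects every value, including CPUHP_OFFLINE and CPUHP_ONLINE, whereas the corrected condition still accepts CPUHP_OFFLINE and everything from the bound up to CPUHP_ONLINE.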
> 
> (b). make everything fully symmetric. (I'm not sure whether this is possible.)
> (c). add all safety checks at the sysfs interface. (For the above example, once
>      the CPU goes down to <some state in between>, disallow going back up
>      unless it first goes down to a state earlier than hrtimers:prepare.
>      I guess this would clutter the source code unnecessarily, though.)
> ...
> 
> Let me know what you think.
> 
> Thanks,
> 
> -Koichiro Den
> 
> 
> > 
> > -Koichiro Den
> > 
> > > 
> > > Thanks,
> > > 
> > >         tglx