Message-ID: <aBTCWPSXgu9cId6D@jlelli-thinkpadt14gen4.remote.csb>
Date: Fri, 2 May 2025 15:02:16 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Florian Fainelli <f.fainelli@...il.com>
Cc: Doug Berger <opendmb@...il.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
Florian Fainelli <florian.fainelli@...adcom.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/topology: clear freecpu bit on detach

Hi,

On 29/04/25 10:15, Florian Fainelli wrote:
>
>
> On 4/22/2025 9:48 PM, Doug Berger wrote:
> > There is a hazard in the deadline scheduler where an offlined CPU
> > can have its free_cpus bit left set in the def_root_domain when
> > the schedutil cpufreq governor is used. This can allow a deadline
> > thread to be pushed to the runqueue of a powered down CPU which
> > breaks scheduling. The details can be found here:
> > https://lore.kernel.org/lkml/20250110233010.2339521-1-opendmb@gmail.com
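
For anyone following along: the reason a leftover free_cpus bit is
dangerous is that the deadline push path trusts it when picking a target
CPU. Roughly, paraphrasing kernel/sched/cpudeadline.c (the exact code
differs a bit between kernel versions):

int cpudl_find(struct cpudl *cp, struct task_struct *p,
               struct cpumask *later_mask)
{
        /*
         * Any CPU still set in free_cpus is treated as an idle, valid
         * push target -- including one that has since been powered
         * down, if nothing cleared its bit in this root domain.
         */
        if (later_mask &&
            cpumask_and(later_mask, cp->free_cpus, &p->cpus_mask))
                return 1;

        /* (max-heap fallback omitted in this paraphrase) */
        return 0;
}

so a stale bit is enough to steer find_later_rq() towards the dead CPU.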
> >
> > The free_cpus mask is expected to be cleared by set_rq_offline();
> > however, the hazard occurs before the root domain is made online
> > during CPU hotplug so that function is not invoked for the CPU
> > that is being made active.
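
For reference, that clearing normally happens via the deadline class's
rq_offline hook, which set_rq_offline() invokes; from
kernel/sched/deadline.c (trimmed, details may vary by version):

static void rq_offline_dl(struct rq *rq)
{
        if (rq->dl.overloaded)
                dl_clear_overload(rq);

        /*
         * Drop this CPU from the root domain's DL heap and from its
         * free_cpus mask.
         */
        cpudl_clear(&rq->rd->cpudl, rq->cpu);
        cpudl_clear_freecpu(&rq->rd->cpudl, rq->cpu);
}

and, as noted above, none of this runs if the rq was never marked online
in that root domain.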
> >
> > This commit works around the issue by ensuring the free_cpus bit
> > for a CPU is always cleared when the CPU is removed from a
> > root_domain. This likely makes the call of cpudl_clear_freecpu()
> > in rq_offline_dl() fully redundant, but I have not removed it
> > here because I am not certain of all flows.
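
If I read the proposal right, "always cleared when the CPU is removed
from a root_domain" boils down to something along these lines in
rq_attach_root() in kernel/sched/topology.c (a sketch of the idea, not
the exact hunk from the patch):

        if (old_rd) {
                if (cpumask_test_cpu(rq->cpu, old_rd->online))
                        set_rq_offline(rq);

                /*
                 * Clear the free_cpus bit unconditionally on detach, so
                 * a CPU that never went through set_rq_offline() for
                 * this root domain cannot be left marked as a free DL
                 * target.
                 */
                cpudl_clear_freecpu(&old_rd->cpudl, rq->cpu);

                cpumask_clear_cpu(rq->cpu, old_rd->span);
        }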
> >
> > It seems likely that a better solution is possible from someone
> > more familiar with the scheduler implementation, but this
> > approach is minimally invasive from someone who is not.
> >
> > Signed-off-by: Doug Berger <opendmb@...il.com>
> > ---
>
> FWIW, we were able to reproduce this with the attached hotplug.sh script
> which would just randomly hot-plug/unplug CPUs (./hotplug.sh 4). Within a
> few hundred iterations you could see the lockup occur; it's unclear why
> this has not been seen by more people.
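
For readers who want to try this without the attachment, a loop along
these lines (a hypothetical stand-in, not Florian's script, and in C
rather than shell) exercises the same hotplug path when run as root:

/*
 * Hypothetical stand-in for the attached hotplug.sh: randomly offline
 * and online CPUs 1..N-1 via sysfs in a tight loop, e.g. "./hotplug 4".
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void set_online(int cpu, int online)
{
        char path[64];
        FILE *f;

        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/online", cpu);
        f = fopen(path, "w");
        if (!f)
                return;
        fprintf(f, "%d\n", online);
        fclose(f);
}

int main(int argc, char **argv)
{
        int ncpus = argc > 1 ? atoi(argv[1]) : 4;

        if (ncpus < 2)
                return 1;
        srand((unsigned)time(NULL));
        for (;;) {
                int cpu = 1 + rand() % (ncpus - 1);     /* keep CPU0 up */

                set_online(cpu, 0);
                set_online(cpu, 1);
        }
        return 0;
}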
>
> Since this is not the first posting or attempt at fixing this bug [1] and we
> consider it to be a serious one, can this be reviewed/commented on/applied?
> Thanks!
>
> [1]: https://lkml.org/lkml/2025/1/14/1687

So, going back to the initial report, the thing that makes me a bit
uncomfortable with the suggested change is the worry that it might be
papering over a more fundamental issue. I'm not against it, though, and I
really appreciate Doug's analysis and proposed fixes!

Doug wrote:

"Initially, CPU0 and CPU1 are active and CPU2 and CPU3 have been
previously offlined so their runqueues are attached to the
def_root_domain.
1) A hot plug is initiated on CPU2.
2) The cpuhp/2 thread invokes the cpufreq governor driver during
the CPUHP_AP_ONLINE_DYN step.
3) The schedutil cpufreq governor creates the "sugov:2" thread to
execute on CPU2 with the deadline scheduler.
4) The deadline scheduler clears the free_cpus mask for CPU2 within
the def_root_domain when "sugov:2" is scheduled."
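
For reference, step 3 above is the governor creating its per-policy
kthread as a SCHED_DEADLINE task and binding it to the policy's CPUs;
paraphrasing kernel/sched/cpufreq_schedutil.c (trimmed, exact contents
depend on the kernel version):

static int sugov_kthread_create(struct sugov_policy *sg_policy)
{
        struct sched_attr attr = {
                .size           = sizeof(struct sched_attr),
                .sched_policy   = SCHED_DEADLINE,
                .sched_flags    = SCHED_FLAG_SUGOV,
                /* fake (unused) bandwidth for this special-cased task */
                .sched_runtime  =  1000000,
                .sched_deadline = 10000000,
                .sched_period   = 10000000,
        };
        struct cpufreq_policy *policy = sg_policy->policy;
        struct task_struct *thread;

        thread = kthread_create(kthread_worker_fn, &sg_policy->worker,
                                "sugov:%d",
                                cpumask_first(policy->related_cpus));
        sched_setattr_nocheck(thread, &attr);
        kthread_bind_mask(thread, policy->related_cpus);
        /* error handling and wake-up trimmed */
        return 0;
}

Step 4 then looks to me like cpudl_set()/cpudl_clear() updating free_cpus
in whatever root domain rq->rd happens to point at in that window, which
here is still the def_root_domain.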

I wonder if it's OK to schedule sugov:2 on a CPU that hasn't yet reached
the fully online state. Peter, others, what do you think?

Thanks,
Juri