Message-ID: <20251205160723.GG2528459@noisy.programming.kicks-ass.net>
Date: Fri, 5 Dec 2025 17:07:23 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Srikar Dronamraju <srikar@...ux.ibm.com>
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
Ben Segall <bsegall@...gle.com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ingo Molnar <mingo@...nel.org>, Juri Lelli <juri.lelli@...hat.com>,
K Prateek Nayak <kprateek.nayak@....com>,
Madhavan Srinivasan <maddy@...ux.ibm.com>,
Mel Gorman <mgorman@...e.de>, Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>,
Shrikanth Hegde <sshegde@...ux.ibm.com>,
Steven Rostedt <rostedt@...dmis.org>,
Swapnil Sapkal <swapnil.sapkal@....com>,
Thomas Huth <thuth@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
virtualization@...ts.linux.dev,
Yicong Yang <yangyicong@...ilicon.com>,
Ilya Leoshkevich <iii@...ux.ibm.com>
Subject: Re: [PATCH 08/17] sched/core: Implement CPU soft offline/online

On Thu, Dec 04, 2025 at 11:23:56PM +0530, Srikar Dronamraju wrote:
> The scheduler already supports CPU online/offline. However, for cases where
> the scheduler has to offline a CPU temporarily, the online/offline cost is
> too high. Hence here is an attempt to come up with a soft-offline that
> looks almost like offline without actually having to do the full
> offline. Since the CPUs are only taken out of use for a short
> duration, they will continue to be part of the CPU topology.
>
> In the soft-offline, the CPU will be marked as inactive, i.e. removed from
> the cpu_active_mask, the CPU's capacity will be reduced, and non-pinned
> tasks will be migrated off the CPU's runqueue.
>
> Similarly, when onlined, the CPU will be marked active again, i.e. added to
> cpu_active_mask, and its capacity will be restored.
>
> Soft-offline is almost the same as the first step of offline, except that
> it does not rebuild the sched-domains. Since the remaining steps,
> including rebuilding the sched-domains, are not done, the overhead of
> soft-offline is lower than that of a regular offline. A new cpumask is
> used to indicate that a soft-offline is in progress, so that rebuilding
> the sched-domains is skipped.

Note that your thing still very much includes the synchronize_rcu() that
a lot of the previous 'hotplug is too slow' crowd have complained about.

So I'm taking it that your steal time thing really isn't that 'fast'.

It might be good to mention the frequency at which you expect cores to
come and go with your setup.
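
For readers following along, the soft-offline steps described in the quoted
changelog can be sketched as a small userspace model. This is only an
illustration under stated assumptions, not the patch itself: it uses no kernel
APIs, every name in it (toy_sched, toy_soft_offline(), soft_offline_mask, and
so on) is hypothetical, and it does not model the synchronize_rcu() cost
mentioned above. The destination CPU for migrated tasks is simply the lowest
numbered active CPU, which is an arbitrary choice made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS        8
#define FULL_CAPACITY  1024u  /* assumed full-capacity value for the model */
#define MIN_CAPACITY   1u     /* assumed "reduced" capacity for the model */

/* Hypothetical task: which CPU's runqueue holds it, and whether it is pinned. */
struct toy_task {
	int cpu;
	bool pinned;
};

/* Hypothetical scheduler state; the masks are plain bitmasks, one bit per CPU. */
struct toy_sched {
	uint32_t active_mask;        /* stand-in for cpu_active_mask */
	uint32_t soft_offline_mask;  /* stand-in for the new cpumask */
	unsigned int capacity[NR_CPUS];
	unsigned int domain_rebuilds; /* counts modeled sched-domain rebuilds */
};

static void toy_init(struct toy_sched *s)
{
	s->active_mask = (1u << NR_CPUS) - 1;
	s->soft_offline_mask = 0;
	s->domain_rebuilds = 0;
	for (int i = 0; i < NR_CPUS; i++)
		s->capacity[i] = FULL_CAPACITY;
}

/* Flip the active bit; rebuild domains unless the CPU is soft-offline. */
static void toy_set_cpu_active(struct toy_sched *s, int cpu, bool active)
{
	if (active)
		s->active_mask |= 1u << cpu;
	else
		s->active_mask &= ~(1u << cpu);

	if (!(s->soft_offline_mask & (1u << cpu)))
		s->domain_rebuilds++;
}

/* Lowest-numbered active CPU, or -1 if none is active. */
static int toy_pick_active_cpu(const struct toy_sched *s)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (s->active_mask & (1u << i))
			return i;
	return -1;
}

/* Returns the number of tasks migrated, or -1 if the CPU was not active. */
static int toy_soft_offline(struct toy_sched *s, int cpu,
			    struct toy_task *tasks, size_t n)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (!(s->active_mask & (1u << cpu)))
		return -1;

	s->soft_offline_mask |= 1u << cpu;   /* mark first: skips the rebuild */
	toy_set_cpu_active(s, cpu, false);
	s->capacity[cpu] = MIN_CAPACITY;

	int dest = toy_pick_active_cpu(s);
	int migrated = 0;
	for (size_t i = 0; i < n; i++) {
		/* Pinned tasks stay where they are. */
		if (tasks[i].cpu == cpu && !tasks[i].pinned && dest >= 0) {
			tasks[i].cpu = dest;
			migrated++;
		}
	}
	return migrated;
}

/* Undo a soft-offline: reactivate, restore capacity, then clear the mark. */
static void toy_soft_online(struct toy_sched *s, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (!(s->soft_offline_mask & (1u << cpu)))
		return;

	toy_set_cpu_active(s, cpu, true);
	s->capacity[cpu] = FULL_CAPACITY;
	s->soft_offline_mask &= ~(1u << cpu);
}
```

In this model, soft-offlining CPU 2 with one pinned and one non-pinned task on
it moves only the non-pinned task, leaves domain_rebuilds at zero, and
toy_soft_online() restores the active bit and the capacity. How often such a
pair of transitions happens is exactly the frequency question raised above.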