Message-Id: <20250627-rneri-fix-cas-clusters-v1-0-121ffb50bbc7@linux.intel.com>
Date: Fri, 27 Jun 2025 14:45:26 -0700
From: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
To: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
Tim C Chen <tim.c.chen@...ux.intel.com>, Barry Song <baohua@...nel.org>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Len Brown <lenb@...nel.org>,
ricardo.neri@...el.com, linux-kernel@...r.kernel.org,
Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
Subject: [PATCH 0/4] sched: Fix cluster scheduling in the presence of
asymmetric capacity
Cluster scheduling balances load among clusters of CPUs sharing a resource
[1]. It was broken on Intel hybrid processors using asymmetric packing of
tasks. Tim fixed that [2]. It is broken again when combined with asymmetric
CPU capacity.
The diagram below shows a processor with big (B) and small (s) CPUs. The
small CPUs are grouped in clusters that share a mid-level cache. This
topology is common in Intel hybrid processors.
 ------     ------
|  B   |   |  B   |   -----------------   -----------------
|      |   |      |   | s | s | s | s |   | s | s | s | s |
 ------     ------    -----------------   -----------------
|  L2  |   |  L2  |   |      L2       |   |      L2       |
-----------------------------------------------------------
|                           L3                            |
-----------------------------------------------------------
On a partially busy system (one with idle CPUs; busy CPUs have one task
each), scheduling for asymmetric capacity ensures that misfit tasks are
placed on the big CPUs. The remaining tasks, misfit or not, run on the
small CPUs. If CONFIG_SCHED_CLUSTER is enabled, these remaining tasks
should be evenly spread between the two small-CPU clusters.
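The expected placement can be sketched with a small stand-alone model. It
is only an illustration: the constants mirror the diagram above, the
function names are hypothetical (they are not kernel symbols), and the
model assumes at most one task per CPU and that only misfit tasks occupy
the big CPUs.

```c
/* Hypothetical topology constants taken from the diagram; not kernel symbols. */
enum { NR_BIG = 2, NR_CLUSTERS = 2, CLUSTER_SIZE = 4 };

/* Misfit tasks take the big CPUs first, one task per CPU. */
static int tasks_on_big(int nr_misfit)
{
	return nr_misfit < NR_BIG ? nr_misfit : NR_BIG;
}

/*
 * Tasks expected on a given small-CPU cluster when the tasks that did not
 * land on a big CPU (misfit or not) are spread evenly among the clusters.
 * Assumes nr_tasks does not exceed the number of CPUs (partially busy
 * system) and 0 <= cluster < NR_CLUSTERS.
 */
static int tasks_on_cluster(int nr_tasks, int nr_misfit, int cluster)
{
	int rest = nr_tasks - tasks_on_big(nr_misfit);

	/* Even share, with any remainder going to the lowest-numbered clusters. */
	return rest / NR_CLUSTERS + (cluster < rest % NR_CLUSTERS ? 1 : 0);
}
```

For example, with six tasks of which two are misfit, the model puts two
tasks on the big CPUs and two on each small-CPU cluster.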
This does not happen today because various checks in the load balancer
prevent a small CPU in one cluster from pulling tasks from another:
* A bug in update_sd_pick_busiest() causes it to skip the capacity check
and prefer a fully_busy big CPU (which the destination CPU cannot help)
over a has_spare small-CPU cluster (which it can help).
* Accounting misfit load in a group is pointless if the destination CPU
is also a small CPU. Moreover, update_sd_pick_busiest() will not
pick such a group as busiest anyway.
* Once a busiest group has been identified, sched_balance_find_src_rq()
will refuse to migrate tasks to CPUs of equal capacity.
* The SD_PREFER_SIBLING flag is removed from scheduling domains with
asymmetric capacity.
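The intent behind the first item can be illustrated with a simplified,
stand-alone predicate. This is a sketch, not the code in
kernel/sched/fair.c: the toy_ names are hypothetical, the group states are
reduced to three, and the roughly 5% comparison margin is an assumption
made for the example.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of group states, ordered from least to most busy. */
enum toy_group_type { TOY_HAS_SPARE, TOY_FULLY_BUSY, TOY_OVERLOADED };

/* Assumed margin of about 5% so that near-equal capacities compare as equal. */
static bool toy_capacity_greater(unsigned long cap1, unsigned long cap2)
{
	return cap1 * 1024 > cap2 * 1078;
}

/*
 * Return true if a candidate group should not be picked as busiest: it is
 * not overloaded and its CPUs have more capacity than the destination CPU,
 * so pulling tasks to the destination would not help.
 */
static bool toy_skip_candidate(enum toy_group_type type,
			       unsigned long min_capacity,
			       unsigned long dst_capacity)
{
	return type <= TOY_FULLY_BUSY &&
	       toy_capacity_greater(min_capacity, dst_capacity);
}
```

With a small destination CPU of capacity 512, the sketch skips a fully
busy big CPU of capacity 1024 but still considers a small-CPU cluster of
capacity 512, whatever its state.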
I address these issues in this series. Details are in the changelog of each
patch.
I tested these patches on an Alder Lake system with Hyper-Threading
disabled. I also tested with CONFIG_SCHED_CLUSTER=n to ensure that
processors without clusters continue to work.
[1]. https://lore.kernel.org/r/20210924085104.44806-1-21cnbao@gmail.com/
[2]. https://lore.kernel.org/r/cover.1688770494.git.tim.c.chen@linux.intel.com/
---
Ricardo Neri (4):
sched/fair: Always skip fully_busy higher-capacity groups for load balance
sched/fair: Ignore misfit load if the destination CPU cannot help
sched/fair: Allow load balancing between CPUs of equal capacity
sched/topology: Keep SD_PREFER_SIBLING for domains with clusters
kernel/sched/fair.c | 27 +++++++++++++++------------
kernel/sched/topology.c | 11 +++++++++--
2 files changed, 24 insertions(+), 14 deletions(-)
---
base-commit: e51a38e71974982abb3f2f16141763a1511f7a3f
change-id: 20250620-rneri-fix-cas-clusters-bb4287d1e152
Best regards,
--
Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>