Message-ID: <20250625191108.1646208-2-sshegde@linux.ibm.com>
Date: Thu, 26 Jun 2025 00:41:00 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
vincent.guittot@...aro.org, tglx@...utronix.de, yury.norov@...il.com,
maddy@...ux.ibm.com
Cc: sshegde@...ux.ibm.com, vschneid@...hat.com, dietmar.eggemann@....com,
rostedt@...dmis.org, kprateek.nayak@....com, huschle@...ux.ibm.com,
srikar@...ux.ibm.com, linux-kernel@...r.kernel.org,
christophe.leroy@...roup.eu, linuxppc-dev@...ts.ozlabs.org,
gregkh@...uxfoundation.org
Subject: [RFC v2 1/9] sched/docs: Document avoid_cpu_mask and avoid CPU concept
This describes what an avoid CPU is and what the scheduler aims to do
when a CPU is marked as avoid.
Signed-off-by: Shrikanth Hegde <sshegde@...ux.ibm.com>
---
Documentation/scheduler/sched-arch.rst | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/Documentation/scheduler/sched-arch.rst b/Documentation/scheduler/sched-arch.rst
index ed07efea7d02..d32755298fca 100644
--- a/Documentation/scheduler/sched-arch.rst
+++ b/Documentation/scheduler/sched-arch.rst
@@ -62,6 +62,31 @@ Your cpu_idle routines need to obey the following rules:
arch/x86/kernel/process.c has examples of both polling and
sleeping idle functions.
+CPU Avoid
+=========
+
+Under paravirt conditions it is possible to overcommit CPU resources,
+i.e. the sum of virtual CPUs (vCPUs) of all VMs is greater than the number of
+physical CPUs (pCPUs). Under such conditions, when all or many VMs have high
+utilization, the hypervisor cannot satisfy the demand and has to context
+switch within or across VMs. A VM-level context switch is more expensive
+than a task context switch within the VM.
+
+In such cases it is better that the VMs coordinate among themselves and
+request less CPU by not using some of their vCPUs. Such vCPUs, on which
+workload can be avoided at the moment, are called "Avoid CPUs". Note that when
+the pCPU contention goes away, these vCPUs can be used again by the workload.
+
+The arch needs to set/unset the vCPU as avoid in cpu_avoid_mask. When set, the
+scheduler avoids the CPU; when unset, it uses the CPU as usual.
+
+The scheduler will try to avoid those CPUs as much as it can.
+This is achieved by:
+1. Not selecting those CPUs at wakeup.
+2. Pushing the task away from an avoid CPU at tick.
+3. Not selecting an avoid CPU at load balance.
+
+This works only for SCHED_RT and SCHED_NORMAL.
Possible arch/ problems
=======================
--
2.43.0
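
For illustration only, the wakeup rule described in the documentation text
above can be sketched in userspace C. This is not code from the series and
does not use the kernel cpumask API: every demo_* name, the 8-CPU limit and
the fallback to the previous CPU when all allowed CPUs are marked avoid are
assumptions made for this sketch, chosen to match the "as much as it can"
wording.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_CPUS 8	/* hypothetical: small fixed CPU count for the sketch */

/* Hypothetical stand-in for cpu_avoid_mask: bit n set means vCPU n is avoid. */
static uint32_t demo_avoid_mask;

/* What an arch would do when pCPU contention starts (avoid) or ends (!avoid). */
static void demo_set_cpu_avoid(unsigned int cpu, bool avoid)
{
	assert(cpu < DEMO_NR_CPUS);
	if (avoid)
		demo_avoid_mask |= UINT32_C(1) << cpu;
	else
		demo_avoid_mask &= ~(UINT32_C(1) << cpu);
}

static bool demo_cpu_avoid(unsigned int cpu)
{
	assert(cpu < DEMO_NR_CPUS);
	return (demo_avoid_mask & (UINT32_C(1) << cpu)) != 0;
}

/*
 * Wakeup-style selection: keep prev_cpu if it is allowed and not avoid,
 * otherwise take the first allowed CPU that is not avoid. If every allowed
 * CPU is avoid, fall back to prev_cpu (assumed best-effort behaviour).
 */
static unsigned int demo_select_cpu(uint32_t allowed, unsigned int prev_cpu)
{
	uint32_t usable = allowed & ~demo_avoid_mask;
	unsigned int cpu;

	assert(prev_cpu < DEMO_NR_CPUS);
	if (usable & (UINT32_C(1) << prev_cpu))
		return prev_cpu;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		if (usable & (UINT32_C(1) << cpu))
			return cpu;
	}

	return prev_cpu;
}
```

In this sketch, marking CPU 2 as avoid makes a task that last ran on CPU 2
and is allowed on CPUs 0-3 wake up on CPU 0 instead; clearing the bit lets
the same task pick CPU 2 again, which mirrors the note that avoid vCPUs
become usable once contention goes away.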