linux-kernel - Re: [RFC v2 1/9] sched/docs: Document avoid_cpu

Message-ID: <20250627002702.1942-1-hdanton@sina.com>
Date: Fri, 27 Jun 2025 08:27:01 +0800
From: Hillf Danton <hdanton@...a.com>
To: Shrikanth Hegde <sshegde@...ux.ibm.com>
Cc: peterz@...radead.org,
	kprateek.nayak@....com,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC v2 1/9] sched/docs: Document avoid_cpu_mask and avoid CPU concept

On Thu, 26 Jun 2025 20:16:36 +0530 Shrikanth Hegde wrote:
> > On Thu, 26 Jun 2025 00:41:00 +0530 Shrikanth Hegde wrote
> >> This describes what avoid CPU means and what scheduler aims to do
> >> when a CPU is marked as avoid.
> >>
> >> Signed-off-by: Shrikanth Hegde <sshegde@...ux.ibm.com>
> >> ---
> >>   Documentation/scheduler/sched-arch.rst | 25 +++++++++++++++++++++++++
> >>   1 file changed, 25 insertions(+)
> >>
> >> diff --git a/Documentation/scheduler/sched-arch.rst b/Documentation/scheduler/sched-arch.rst
> >> index ed07efea7d02..d32755298fca 100644
> >> --- a/Documentation/scheduler/sched-arch.rst
> >> +++ b/Documentation/scheduler/sched-arch.rst
> >> @@ -62,6 +62,31 @@ Your cpu_idle routines need to obey the following rules:
> >>   arch/x86/kernel/process.c has examples of both polling and
> >>   sleeping idle functions.
> >>   
> >> +CPU Avoid
> >> +=========
> >> +
> >> +Under paravirt conditions it is possible to overcommit CPU resources.
> >> +i.e sum of virtual CPU(vCPU) of all VM is greater than number of physical
> >> +CPUs(pCPU). Under such conditions when all or many VM have high utilization,
> >> +hypervisor won't be able to satisfy the requirement and has to context switch
> >> +within or across VM. VM level context switch is more expensive compared to
> >> +task context switch within the VM.
> >> +
> > Sounds like the VMs are not well configured (or the pCPUs not well partitioned).
> 
> No. As I understand it, that is how VMs are configured in the paravirtualized case.
> Correct me if I am wrong.
> 
> On powerpc, we have Shared Processor Logical Partitions (SPLPAR), which allow overcommit.
> When other LPARs (VMs) are idle, overcommit lets one get more work done. It also allows one
> to configure more VMs. The said issue happens only when every/most VMs ask for
> CPU at the same time.
> 
Putting virtualization aside, let's look at a simpler case where more
than 1024 apps are bound to a single CPU (on a ppc box with 4 CPUs, for
instance). What can we do in the kernel wrt app responsiveness? Nothing,
because the resource/budget is never enough without a sane config.
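
To put rough numbers on the two cases discussed above, here is a minimal
sketch. It is illustrative only: the helper names are hypothetical, the VM
sizes in the usage note are made up, and the share model assumes
equal-weight tasks that are always runnable, which real workloads are not.

```python
def overcommit_ratio(vcpus_per_vm, pcpus):
    """Total vCPUs across all VMs divided by physical CPUs.

    A result above 1.0 means the host is overcommitted, i.e. the
    hypervisor cannot run every vCPU at once if all VMs are busy.
    """
    if pcpus <= 0:
        raise ValueError("pcpus must be positive")
    return sum(vcpus_per_vm) / pcpus


def fair_share_per_task(ntasks, ncpus_bound=1):
    """Ideal fraction of one CPU per task under an equal split.

    Assumes every task is always runnable and equally weighted
    (a simplification, not how any real scheduler accounts time).
    """
    if ntasks <= 0 or ncpus_bound <= 0:
        raise ValueError("ntasks and ncpus_bound must be positive")
    # A task can never use more than one full CPU at a time.
    return min(1.0, ncpus_bound / ntasks)
```

For example, three hypothetical VMs with 4 vCPUs each on 8 pCPUs give
overcommit_ratio([4, 4, 4], 8) == 1.5, and 1024 tasks bound to one CPU
give fair_share_per_task(1024) == 1/1024 of a CPU each, whatever the
scheduler does.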