linux-kernel - Re: [RFC patch v3 01/20] sched: Cache aware load-balancing


Message-ID: <75e763df63fdddc77fcb2a02bfc3b94eb22aadb2.camel@linux.intel.com>
Date: Thu, 26 Jun 2025 17:10:24 -0700
From: Tim Chen <tim.c.chen@...ux.intel.com>
To: "Chen, Yu C" <yu.c.chen@...el.com>, Jianyong Wu <jianyong.wu@...look.com>
Cc: Juri Lelli <juri.lelli@...hat.com>, Dietmar Eggemann
 <dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, Ben
 Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, Valentin
 Schneider <vschneid@...hat.com>, Tim Chen <tim.c.chen@...el.com>, Vincent
 Guittot <vincent.guittot@...aro.org>, Libo Chen <libo.chen@...cle.com>,
 Abel Wu <wuyun.abel@...edance.com>, Madadi Vineeth Reddy
 <vineethr@...ux.ibm.com>,  Hillf Danton <hdanton@...a.com>, Len Brown
 <len.brown@...el.com>, linux-kernel@...r.kernel.org, "Gautham R . Shenoy"
 <gautham.shenoy@....com>, Ingo Molnar <mingo@...hat.com>, K Prateek Nayak
 <kprateek.nayak@....com>, Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC patch v3 01/20] sched: Cache aware load-balancing

On Thu, 2025-06-26 at 21:32 +0800, Chen, Yu C wrote:
> 
> > 
> > This task work may take a long time on a system with a large number of
> > CPUs, which increases the delay before the process returns to userspace.
> > It may be the reason that the schbench benchmark regressed so much.
> > 
> 
> Thanks for the insight Jianyong, yes, the scan on all online CPUs would
> be costly.
> 
> > To avoid searching the whole system, what about searching only the
> > preferred NUMA node provided by NUMA balancing, if there is one? If not,
> > then fall back to searching the whole system, or search only the NUMA
> > node where the main process is located, as there is a high probability
> > it contains the preferred LLC. In other words, we can opt for a
> > suboptimal LLC location to prioritize speed.
> > 
> > WDYT?
> > 
> This is a good idea. Previously, Tim had a version that dealt with a
> similar scenario, which only scanned the CPUs within p's preferred node.

Yes, we were also thinking along the lines of looking only at the preferred
node.

>   However, it seems to cause bouncing of the mm->mm_sched_cpu because we
> set a 2X threshold for switching the mm->mm_sched_cpu in patch 5. If the
> old mm_sched_cpu is not in p's current preferred node, last_m_a_occ is
> always 0, which makes the switching of mm->mm_sched_cpu always succeed
> due to the condition if (m_a_occ > (2 * last_m_a_occ)). 
> 
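
[Illustration] The bouncing described above can be reproduced with a small
userspace model. This is only a sketch under stated assumptions, not the code
from the patch series: the per-CPU occupancy array, NR_CPUS_SKETCH,
CPUS_PER_NODE_SKETCH and every function name below are hypothetical. The only
parts taken from the thread are the condition m_a_occ > (2 * last_m_a_occ)
and the fact that last_m_a_occ reads as 0 when the old mm_sched_cpu lies
outside the scanned node.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical topology for the sketch: 2 nodes with 4 CPUs each. */
#define NR_CPUS_SKETCH        8
#define CPUS_PER_NODE_SKETCH  4

static int cpu_to_node_sketch(int cpu)
{
	return cpu / CPUS_PER_NODE_SKETCH;
}

/*
 * Occupancy as seen by a scan restricted to scan_node. A CPU outside the
 * scanned node is never visited, so its occupancy reads as 0.
 */
static unsigned long scanned_occ(const unsigned long occ[NR_CPUS_SKETCH],
				 int cpu, int scan_node)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	return cpu_to_node_sketch(cpu) == scan_node ? occ[cpu] : 0;
}

/* The 2X switching threshold quoted in the thread. */
static bool should_switch(unsigned long m_a_occ, unsigned long last_m_a_occ)
{
	return m_a_occ > (2 * last_m_a_occ);
}

/* Returns the CPU chosen after scanning only preferred_node. */
static int pick_mm_sched_cpu(const unsigned long occ[NR_CPUS_SKETCH],
			     int old_cpu, int preferred_node)
{
	unsigned long m_a_occ = 0;
	unsigned long last_m_a_occ;
	int best = -1;
	int cpu;

	/* Find the busiest CPU inside the preferred node only. */
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (cpu_to_node_sketch(cpu) != preferred_node)
			continue;
		if (occ[cpu] > m_a_occ) {
			m_a_occ = occ[cpu];
			best = cpu;
		}
	}

	/* Reads as 0 if old_cpu is not in preferred_node. */
	last_m_a_occ = scanned_occ(occ, old_cpu, preferred_node);

	if (best >= 0 && should_switch(m_a_occ, last_m_a_occ))
		return best;
	return old_cpu;
}
```

In this model, with occupancy 100 on CPU 0 (node 0) and 10 on CPU 4 (node 1),
a scan of node 1 moves the choice from CPU 0 to CPU 4 even though CPU 4 is
ten times less occupied, and a later scan of node 0 moves it straight back.
When the old CPU is inside the scanned node, the 2X threshold damps the
switch as intended.
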
There were some regressions on schbench during our tests, and the preferred
LLC switches a lot when the scan is limited to the preferred node, as
mentioned by Chen Yu.  For schbench, there's really not much NUMA data and
the preferred node bounces around. We'll have to figure out the right thing
to do if the preferred node changes and the preferred LLC falls outside the
preferred node.

Tim

> Anyway, since it
> is a software issue, we can find a way to address it.
> 
> Maybe we can also follow Abel's suggestion that only one thread of
> the process is allowed to perform the statistics calculation; this
> could minimize the negative impact on the whole process.
>
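
[Illustration] One way to let only a single thread of a process run the
statistics calculation at a time is a try-lock style guard. The sketch below
uses C11 atomics in userspace and is an assumption for illustration only: the
struct mm_sketch, its fields, and the function names are hypothetical, and
the thread does not say which mechanism the patch series would use.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-process state; not the kernel's mm_struct. */
struct mm_sketch {
	atomic_flag scan_in_progress;	/* set while one thread is scanning */
	unsigned long scans_done;	/* only touched by the scanning thread */
};

/* Returns true if the caller won the right to scan. */
static bool try_begin_scan(struct mm_sketch *mm)
{
	return !atomic_flag_test_and_set_explicit(&mm->scan_in_progress,
						  memory_order_acquire);
}

static void end_scan(struct mm_sketch *mm)
{
	atomic_flag_clear_explicit(&mm->scan_in_progress,
				   memory_order_release);
}

/*
 * Called by every thread of the process. Threads that lose the race return
 * immediately instead of repeating the costly scan.
 */
static bool maybe_scan(struct mm_sketch *mm)
{
	if (!try_begin_scan(mm))
		return false;

	mm->scans_done++;	/* placeholder for the occupancy scan */

	end_scan(mm);
	return true;
}
```

Threads that find the flag already set skip the scan and return to userspace
right away, so the cost is paid by at most one thread at a time.
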