linux-kernel - Re: [PATCH v2 17/23] sched/cache: Record the number of active threads per process for cache-aware scheduling

Message-ID: <61b1242b-59a8-42ea-a6a5-813fa063f0fd@intel.com>
Date: Wed, 17 Dec 2025 20:51:50 +0800
From: "Chen, Yu C" <yu.c.chen@...el.com>
To: Aaron Lu <ziqianlu@...edance.com>
CC: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, "K
 Prateek Nayak" <kprateek.nayak@....com>, "Gautham R . Shenoy"
	<gautham.shenoy@....com>, Vincent Guittot <vincent.guittot@...aro.org>, "Juri
 Lelli" <juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, "Mel
 Gorman" <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, "Madadi
 Vineeth Reddy" <vineethr@...ux.ibm.com>, Hillf Danton <hdanton@...a.com>,
	Shrikanth Hegde <sshegde@...ux.ibm.com>, Jianyong Wu
	<jianyong.wu@...look.com>, Yangyu Chen <cyy@...self.name>, Tingyin Duan
	<tingyin.duan@...il.com>, Vern Hao <vernhao@...cent.com>, Vern Hao
	<haoxing990@...il.com>, Len Brown <len.brown@...el.com>, Aubrey Li
	<aubrey.li@...el.com>, Zhao Liu <zhao1.liu@...el.com>, Chen Yu
	<yu.chen.surf@...il.com>, Adam Li <adamli@...amperecomputing.com>, Tim Chen
	<tim.c.chen@...el.com>, <linux-kernel@...r.kernel.org>, Tim Chen
	<tim.c.chen@...ux.intel.com>
Subject: Re: [PATCH v2 17/23] sched/cache: Record the number of active threads
 per process for cache-aware scheduling

On 12/17/2025 5:40 PM, Aaron Lu wrote:
> On Wed, Dec 03, 2025 at 03:07:36PM -0800, Tim Chen wrote:
>> @@ -1501,6 +1507,7 @@ static void __no_profile task_cache_work(struct callback_head *work)
>>   		mm->mm_sched_cpu = m_a_cpu;
>>   	}
>>   
>> +	update_avg(&mm->nr_running_avg, nr_running);
> 
> update_avg() doesn't appear to deal with small numbers well and can have
> an error as large as 7, e.g. when nr_running < 8, nr_running_avg will
> always be 0 and when nr_running >= 8 && < 16, nr_running_avg will be
> 1 - 8, etc.
> 
> AMD Genoa has 8 cores per LLC and this will break exceed_llc_nr() there.
> 
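
To make the rounding above concrete, here is a small user-space sketch (not
kernel code). It assumes update_avg() has the form "*avg += diff / 8" and uses
stdint types in place of u64/s64; settle_div8() is a hypothetical helper that
feeds the same sample repeatedly, starting from an average of 0.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for update_avg(); the fixed /8 form is an assumption. */
static inline void update_avg_div8(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	/* Integer division truncates toward zero, so |diff| < 8 adds nothing. */
	*avg = (uint64_t)((int64_t)*avg + diff / 8);
}

/* Hypothetical helper: apply a constant sample 'rounds' times from avg = 0. */
static inline uint64_t settle_div8(uint64_t sample, unsigned int rounds)
{
	uint64_t avg = 0;

	assert(rounds > 0);
	while (rounds--)
		update_avg_div8(&avg, sample);

	return avg;
}
```

With a constant sample of 7 the average never leaves 0, with 8 it stops at 1,
and with 15 it stops at 8, which matches the ranges described above.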

Ah, you are right, thanks for pointing this out. Dividing by 8 makes the
average converge slowly, or stall entirely for small values, on systems with
small LLCs. Maybe we should take the number of cores in the LLC into account:
the smaller that number, the more weight we give to the diff between two
invocations of update_avg()?

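/* nr_cores_llc: number of cores sharing the LLC (not defined in this sketch) */
static inline void sched_cache_update_avg(u64 *avg, u64 sample)
{
	s64 diff = sample - *avg;
	/* Smaller LLC -> smaller divisor -> the average tracks the sample faster. */
	u32 divisor = clamp_t(u32, nr_cores_llc / 4, 2, 8);

	*avg += diff / divisor;
}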

For <= 8 cores per LLC, the divisor is 2;
for 16 cores per LLC, the divisor is 4;
for >= 32 cores per LLC, the divisor is 8.
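
The divisor selection can be tried out in user space with the sketch below.
It is only an illustration under stated assumptions: llc_avg_divisor() and
settle() are hypothetical helpers, the clamp is open-coded instead of using
clamp_t(), and nr_cores_llc is passed as a parameter because where the real
value would come from is not settled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: open-coded clamp(nr_cores_llc / 4, 2, 8). */
static inline uint32_t llc_avg_divisor(uint32_t nr_cores_llc)
{
	uint32_t d = nr_cores_llc / 4;

	if (d < 2)
		return 2;
	if (d > 8)
		return 8;
	return d;
}

/* User-space variant of the proposed update; nr_cores_llc is a parameter. */
static inline void sched_cache_update_avg(uint64_t *avg, uint64_t sample,
					  uint32_t nr_cores_llc)
{
	int64_t diff = (int64_t)(sample - *avg);
	uint32_t divisor = llc_avg_divisor(nr_cores_llc);

	assert(divisor >= 2 && divisor <= 8);
	/* Signed division, truncating toward zero. */
	*avg = (uint64_t)((int64_t)*avg + diff / (int64_t)divisor);
}

/* Hypothetical helper: apply a constant sample 'rounds' times from avg = 0. */
static inline uint64_t settle(uint64_t sample, uint32_t nr_cores_llc,
			      unsigned int rounds)
{
	uint64_t avg = 0;

	while (rounds--)
		sched_cache_update_avg(&avg, sample, nr_cores_llc);

	return avg;
}
```

In this sketch a constant sample of 7 settles at 6 with 8 cores per LLC, at 4
with 16 cores, and stays at 0 with 32 cores. Because the division truncates,
an average rising toward a constant sample stops once the remaining diff is
smaller than the divisor, so it can still sit up to (divisor - 1) below the
sample.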

Thanks,
Chenyu

>>   	free_cpumask_var(cpus);
>>   }