Message-ID: <8fab615c-1d78-432b-9acc-01cafe393040@intel.com>
Date: Thu, 16 Oct 2025 00:07:37 +0800
From: "Chen, Yu C" <yu.c.chen@...el.com>
To: Peter Zijlstra <peterz@...radead.org>, Madadi Vineeth Reddy
<vineethr@...ux.ibm.com>
CC: Tim Chen <tim.c.chen@...ux.intel.com>, Ingo Molnar <mingo@...hat.com>, "K
Prateek Nayak" <kprateek.nayak@....com>, "Gautham R . Shenoy"
<gautham.shenoy@....com>, Vincent Guittot <vincent.guittot@...aro.org>, "Juri
Lelli" <juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, "Mel
Gorman" <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, "Hillf
Danton" <hdanton@...a.com>, Shrikanth Hegde <sshegde@...ux.ibm.com>,
"Jianyong Wu" <jianyong.wu@...look.com>, Yangyu Chen <cyy@...self.name>,
Tingyin Duan <tingyin.duan@...il.com>, Vern Hao <vernhao@...cent.com>, Len
Brown <len.brown@...el.com>, Aubrey Li <aubrey.li@...el.com>, Zhao Liu
<zhao1.liu@...el.com>, Chen Yu <yu.chen.surf@...il.com>, Adam Li
<adamli@...amperecomputing.com>, Tim Chen <tim.c.chen@...el.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 01/19] sched/fair: Add infrastructure for cache-aware load
balancing
On 10/15/2025 7:54 PM, Peter Zijlstra wrote:
> On Wed, Oct 15, 2025 at 12:42:48AM +0530, Madadi Vineeth Reddy wrote:
>>> +static void get_scan_cpumasks(cpumask_var_t cpus, int cache_cpu,
>>> +			      int pref_nid, int curr_cpu)
>>> +{
>>> +#ifdef CONFIG_NUMA_BALANCING
>>> +	/* First honor the task's preferred node. */
>>> +	if (pref_nid != NUMA_NO_NODE)
>>> +		cpumask_or(cpus, cpus, cpumask_of_node(pref_nid));
>>> +#endif
>>> +
>>> +	/* Next honor the task's cache CPU if it is not included. */
>>> +	if (cache_cpu != -1 && !cpumask_test_cpu(cache_cpu, cpus))
>>> +		cpumask_or(cpus, cpus,
>>> +			   cpumask_of_node(cpu_to_node(cache_cpu)));
>>> +
>>> +	/*
>>> +	 * Lastly make sure that the task's current running node is
>>> +	 * considered.
>>> +	 */
>>> +	if (!cpumask_test_cpu(curr_cpu, cpus))
>>> +		cpumask_or(cpus, cpus, cpumask_of_node(cpu_to_node(curr_cpu)));
>>> +}
>>> +
>>> +static void __no_profile task_cache_work(struct callback_head *work)
>>> +{
>>> +	struct task_struct *p = current;
>>> +	struct mm_struct *mm = p->mm;
>>> +	unsigned long m_a_occ = 0;
>>> +	unsigned long curr_m_a_occ = 0;
>>> +	int cpu, m_a_cpu = -1, cache_cpu,
>>> +	    pref_nid = NUMA_NO_NODE, curr_cpu;
>>> +	cpumask_var_t cpus;
>>> +
>>> +	WARN_ON_ONCE(work != &p->cache_work);
>>> +
>>> +	work->next = work;
>>> +
>>> +	if (p->flags & PF_EXITING)
>>> +		return;
>>> +
>>> +	if (!zalloc_cpumask_var(&cpus, GFP_KERNEL))
>>> +		return;
>>> +
>>> +	curr_cpu = task_cpu(p);
>>> +	cache_cpu = mm->mm_sched_cpu;
>>> +#ifdef CONFIG_NUMA_BALANCING
>>> +	if (static_branch_likely(&sched_numa_balancing))
>>> +		pref_nid = p->numa_preferred_nid;
>>> +#endif
>>> +
>>> +	scoped_guard (cpus_read_lock) {
>>> +		get_scan_cpumasks(cpus, cache_cpu,
>>> +				  pref_nid, curr_cpu);
>>> +
>>
>> IIUC, `get_scan_cpumasks` ORs together the preferred NUMA node, cache CPU's node,
>> and current CPU's node. This could result in scanning multiple nodes, not preferring
>> the NUMA preferred node.
>
> So this used to be online_mask, and is now magically changed to this
> more limited mask.
>
> Could you split this change out and have it have a justification?
OK, we will split this change out into a separate patch and explain why the scan is limited to this mask.
thanks,
Chenyu
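
For readers following the thread, the mask-building logic quoted above can be modeled in user space. What follows is a minimal sketch, not kernel code. It assumes a hypothetical topology of 4 CPUs per node with contiguous CPU numbering, and the helpers node_mask(), cpu_node(), test_cpu() and scan_mask() are illustrative stand-ins for cpumask_of_node(), cpu_to_node(), cpumask_test_cpu() and get_scan_cpumasks(). It shows how the three ORs can widen the scan from one node to as many as three, which is the concern raised in the review.

```c
#include <assert.h>
#include <stdint.h>

#define NO_NODE		(-1)	/* stand-in for NUMA_NO_NODE */
#define CPUS_PER_NODE	4	/* assumed topology, not from the patch */
#define MAX_CPUS	64	/* one bit per CPU in a uint64_t */

/* Bitmask of all CPUs that belong to node 'nid' (hypothetical layout). */
static uint64_t node_mask(int nid)
{
	assert(nid >= 0 && (nid + 1) * CPUS_PER_NODE <= MAX_CPUS);
	return ((1ULL << CPUS_PER_NODE) - 1) << (nid * CPUS_PER_NODE);
}

/* Node that owns 'cpu' under the assumed contiguous numbering. */
static int cpu_node(int cpu)
{
	assert(cpu >= 0 && cpu < MAX_CPUS);
	return cpu / CPUS_PER_NODE;
}

/* Non-zero if 'cpu' is set in 'mask'. */
static int test_cpu(int cpu, uint64_t mask)
{
	return (int)((mask >> cpu) & 1);
}

/*
 * Mirrors the order of the quoted get_scan_cpumasks(): preferred node
 * first, then the cache CPU's node, then the current CPU's node. Each
 * step only adds CPUs, so the result is the union of up to three nodes.
 */
static uint64_t scan_mask(int cache_cpu, int pref_nid, int curr_cpu)
{
	uint64_t cpus = 0;

	if (pref_nid != NO_NODE)
		cpus |= node_mask(pref_nid);

	if (cache_cpu != -1 && !test_cpu(cache_cpu, cpus))
		cpus |= node_mask(cpu_node(cache_cpu));

	if (!test_cpu(curr_cpu, cpus))
		cpus |= node_mask(cpu_node(curr_cpu));

	return cpus;
}
```

Under these assumptions, scan_mask(5, 0, 9) covers nodes 0, 1 and 2 because the preferred node, the cache CPU and the current CPU all sit on different nodes, while scan_mask(1, 0, 2) stays within node 0. The mask gives no extra weight to the preferred node; it is only one of the contributors to the union.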