Message-ID: <9c332c5b-83d9-465c-b02f-6648af9a9fae@os.amperecomputing.com>
Date: Mon, 29 Sep 2025 18:43:27 +0800
From: Adam Li <adamli@...amperecomputing.com>
To: "Chen, Yu C" <yu.c.chen@...el.com>
Cc: Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
Libo Chen <libo.chen@...cle.com>,
Madadi Vineeth Reddy <vineethr@...ux.ibm.com>,
Hillf Danton <hdanton@...a.com>, Shrikanth Hegde <sshegde@...ux.ibm.com>,
Jianyong Wu <jianyong.wu@...look.com>, Yangyu Chen <cyy@...self.name>,
Tingyin Duan <tingyin.duan@...il.com>, Vern Hao <vernhao@...cent.com>,
Len Brown <len.brown@...el.com>, Tim Chen <tim.c.chen@...ux.intel.com>,
Aubrey Li <aubrey.li@...el.com>, Zhao Liu <zhao1.liu@...el.com>,
Chen Yu <yu.chen.surf@...il.com>, linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
K Prateek Nayak <kprateek.nayak@....com>,
"Gautham R . Shenoy" <gautham.shenoy@....com>
Subject: Re: [RFC PATCH v4 08/28] sched: Set up LLC indexing
On 9/26/2025 9:51 PM, Chen, Yu C wrote:
> Hi Adam,
>
> On 9/26/2025 2:14 PM, Adam Li wrote:
>> Hi Chen Yu,
>>
>> I tested the patch set on AmpereOne CPU with 192 cores.
>> With a certain firmware setting, each core has its own L1/L2 cache.
>> But *no* cores share LLC (L3). So *no* schedule domain
>> has flag 'SD_SHARE_LLC'.
>>
>
> Good catch! And many thanks for your detailed testing and
> analysis.
>
> Is this issue triggered with CONFIG_SCHED_CLUSTER disabled?
>
Yes. With CONFIG_SCHED_CLUSTER enabled this issue will
not be triggered. The maximum sd_llc_idx will be less than MAX_LLC(64)
since we have 24 (192/8) cluster domains.
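The arithmetic can be sketched as below. This is only an illustration, not
kernel code: nr_llc_indices() and would_hit_bug() are hypothetical helpers,
and the sketch assumes indices are dense and 0-based, one per group of CPUs.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LLC 64	/* value quoted in this thread */

/*
 * Hypothetical helper: number of index slots needed when every group of
 * cpus_per_group CPUs gets one index (rounded up).
 */
static unsigned int nr_llc_indices(unsigned int nr_cpus,
				   unsigned int cpus_per_group)
{
	assert(nr_cpus > 0 && cpus_per_group > 0);
	return (nr_cpus + cpus_per_group - 1) / cpus_per_group;
}

/*
 * Hypothetical helper: applies the quoted 'idx > MAX_LLC' comparison to
 * the largest index, assuming indices start at 0.
 */
static bool would_hit_bug(unsigned int nr_cpus, unsigned int cpus_per_group)
{
	unsigned int max_idx = nr_llc_indices(nr_cpus, cpus_per_group) - 1;

	return max_idx > MAX_LLC;
}
```

With 192 CPUs in groups of 8 this gives 24 indices and the check passes;
with one index per CPU it gives 192 indices and the check fails.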
>> With this topology:
>> per_cpu(sd_llc_id, cpu) is actually the cpu id (0-191).
>>
>> And a kernel BUG will be triggered at:
>> 'BUG_ON(idx > MAX_LLC)'
>>
>
> Yes, the sd_llc_idx thing is a bit tricky - we want to use it to
> index into the static array struct sg_lb_stat.nr_pref_llc, and
> we have to limit its range. A better approach would be to
> dynamically allocate the buffer, so we could get rid of the
> 'idx > MAX_LLC' check, but that might complicate the code.
>
>> Please see details below.
>>
>> The bug will disappear if setting 'MAX_LLC' to 192.
>> But I think we might disable CAS(cache aware scheduling)
>> if no domain has 'SD_SHARE_LLC'.
>>
>
> I agree with you. Simply disabling cache-aware scheduling
> if there is no SD_SHARE_LLC would be simpler.
>
>> On 8/9/2025 1:03 PM, Chen Yu wrote:
>> A draft patch like the one below can fix the kernel BUG:
>> 1) Do not call update_llc_idx() if domain has no SD_SHARE_LLC
>> 2) Disable CAS if domain has no SD_SHARE_LLC
>>
>> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
>> index 8483c02b4d28..cde9b6cdb1de 100644
>> --- a/kernel/sched/topology.c
>> +++ b/kernel/sched/topology.c
>> @@ -704,7 +704,8 @@ static void update_top_cache_domain(int cpu)
>> per_cpu(sd_llc_size, cpu) = size;
>> per_cpu(sd_llc_id, cpu) = id;
>> rcu_assign_pointer(per_cpu(sd_llc_shared, cpu), sds);
>> - update_llc_idx(cpu);
>> + if (sd)
>> + update_llc_idx(cpu);
>>
>
> OK, that makes sense.
>
>> sd = lowest_flag_domain(cpu, SD_CLUSTER);
>> if (sd)
>> @@ -2476,6 +2477,7 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
>> int i, ret = -ENOMEM;
>> bool has_asym = false;
>> bool has_cluster = false;
>> + bool has_llc = false;
>> bool llc_has_parent_sd = false;
>> unsigned int multi_llcs_node = 1;
>>
>> @@ -2621,6 +2623,9 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
>>
>> if (lowest_flag_domain(i, SD_CLUSTER))
>> has_cluster = true;
>> +
>> + if (highest_flag_domain(i, SD_SHARE_LLC))
>> + has_llc = true;
>> }
>> rcu_read_unlock();
>>
>> @@ -2631,7 +2636,8 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
>> static_branch_inc_cpuslocked(&sched_cluster_active);
>>
>> #ifdef CONFIG_SCHED_CACHE
>> - if (llc_has_parent_sd && multi_llcs_node && !sched_asym_cpucap_active())
>> + if (has_llc && llc_has_parent_sd && multi_llcs_node &&
>
> multi_llcs_node will be false if there is no SD_SHARE_LLC domain on the
> platform, so I suppose we don’t have to introduce has_llc?
> multi_llcs is set to true iff there are more than 1 SD_SHARE_LLC domains under its
> SD_SHARE_LLC parent domain.
>
If there is *no* SD_SHARE_LLC domain, my test shows 'multi_llcs_node' is still 1 (true).
It looks like this is because the default value of 'multi_llcs_node' is 1 in
build_sched_domains():
    unsigned int multi_llcs_node = 1;
This condition is always false when there is no SD_SHARE_LLC domain,
so 'multi_llcs_node' is never changed:
    if (!(sd->flags & SD_SHARE_LLC) && child &&
        (child->flags & SD_SHARE_LLC))
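
A minimal user-space sketch of this behaviour follows. It is not the kernel
code: struct toy_domain, is_llc_parent(), compute_multi_llcs_node() and the
SD_SHARE_LLC value are hypothetical stand-ins, and the value written when an
LLC parent is found (more than one LLC under it) is assumed from the
description quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SD_SHARE_LLC 0x1u	/* hypothetical flag value for this sketch */

/* Simplified stand-in for struct sched_domain: flags plus a child link. */
struct toy_domain {
	unsigned int flags;
	const struct toy_domain *child;
};

/*
 * Same shape as the quoted condition: the domain itself lacks
 * SD_SHARE_LLC, but its child has it.
 */
static bool is_llc_parent(const struct toy_domain *sd)
{
	return !(sd->flags & SD_SHARE_LLC) && sd->child &&
	       (sd->child->flags & SD_SHARE_LLC);
}

/*
 * multi_llcs_node starts at 1 and is rewritten only when an LLC parent
 * is found. Without any SD_SHARE_LLC domain the default survives.
 */
static unsigned int compute_multi_llcs_node(const struct toy_domain *levels,
					    size_t nr_levels,
					    unsigned int nr_llcs_under_parent)
{
	unsigned int multi_llcs_node = 1;
	size_t i;

	assert(levels != NULL || nr_levels == 0);
	for (i = 0; i < nr_levels; i++) {
		if (is_llc_parent(&levels[i]))
			multi_llcs_node = nr_llcs_under_parent > 1;
	}
	return multi_llcs_node;
}
```

In this sketch, a hierarchy with no SD_SHARE_LLC flag at any level returns
the default 1, which is why an explicit 'has_llc' check (or a default of 0)
would be needed to keep CAS disabled on such a topology.
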
Thanks,
-adam