[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <SL2PR06MB30825F36A5963C6A05A39F9DBDE99@SL2PR06MB3082.apcprd06.prod.outlook.com>
Date: Fri, 8 Apr 2022 02:07:27 +0000
From: 王擎 <wangqing@...o.com>
To: Dietmar Eggemann <dietmar.eggemann@....com>,
Sudeep Holla <sudeep.holla@....com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC: wangqing <11112896@...tel.com>
Subject: [PATCH] arch_topology: support parsing cache topology from DT
>
>[...]
>
>> +void init_cpu_cache_topology(void)
>> +{
>> + struct device_node *node_cpu, *node_cache;
>> + int cpu;
>> + int level = 0;
>> +
>> + for_each_possible_cpu(cpu) {
>> + node_cpu = of_get_cpu_node(cpu, NULL);
>> + if (!node_cpu)
>> + continue;
>> +
>> + level = 0;
>> + node_cache = node_cpu;
>> + while (level < MAX_CACHE_LEVEL) {
>> + node_cache = of_parse_phandle(node_cache, "next-level-cache", 0);
>> + if (!node_cache)
>> + break;
>> +
>> + cache_topology[cpu][level++] = node_cache;
>> + }
>> + of_node_put(node_cpu);
>> + }
>> +}
>
>From where is init_cpu_cache_topology() called?
All arch which support GENERIC_ARCH_TOPOLOGY can be called,
I will take arm64 as an example in V2.
>
>> +bool cpu_share_llc(int cpu1, int cpu2)
>> +{
>> + int cache_level;
>> +
>> + for (cache_level = MAX_CACHE_LEVEL - 1; cache_level > 0; cache_level--) {
>> + if (!cache_topology[cpu1][cache_level])
>> + continue;
>> +
>> + if (cache_topology[cpu1][cache_level] == cache_topology[cpu2][cache_level])
>> + return true;
>> +
>> + return false;
>> + }
>> +
>> + return false;
>> +}
>
>Like I mentioned in:
>
>https://lkml.kernel.org/r/73b491fe-b5e8-ebca-081e-fa339cc903e1@arm.com
>
>the correct setting in DT's cpu-map node (only core nodes in your case
>(One DynamIQ cluster) will give you the correct LLC (highest
>SD_SHARE_PKG_RESOURCES) setting.
Thanks, I have seen your reply, but not 100% agree.
1.A cluster not only considers the cache, but also a set of the same arch,capability,
and policy. So we can't simply change its cluster(little, medium, big) even if DSU.
2.The cpu-map node can't give the correct LLC, it just give the correct cluster/core group
>
>https://www.kernel.org/doc/Documentation/devicetree/bindings/arm/topology.txt
>
>> +
>> +bool cpu_share_l2c(int cpu1, int cpu2)
>> +{
>> + if (!cache_topology[cpu1][0])
>> + return false;
>> +
>> + if (cache_topology[cpu1][0] == cache_topology[cpu2][0])
>> + return true;
>> +
>> + return false;
>> +}
>> +
>> /*
>> * cpu topology table
>> */
>> @@ -662,7 +720,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>> /* not numa in package, lets use the package siblings */
>> core_mask = &cpu_topology[cpu].core_sibling;
>> }
>> - if (cpu_topology[cpu].llc_id != -1) {
>> + if (cpu_topology[cpu].llc_id != -1 || cache_topology[cpu][0]) {
>> if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
>> core_mask = &cpu_topology[cpu].llc_sibling;
>> }
>> @@ -684,7 +742,8 @@ void update_siblings_masks(unsigned int cpuid)
>> for_each_online_cpu(cpu) {
>> cpu_topo = &cpu_topology[cpu];
>>
>> - if (cpuid_topo->llc_id == cpu_topo->llc_id) {
>> + if ((cpuid_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id)
>> + || (cpuid_topo->llc_id == -1 && cpu_share_llc(cpu, cpuid))) {
>
>Assuming a:
>
> .---------------.
>CPU |0 1 2 3 4 5 6 7|
> +---------------+
>uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
> +---------------+
> L2 | | | | | | |
We only discuss the 4-core system here, but it should be:
L2 | | | | | | | |
> +---------------+
> L3 |<-- -->|
> +---------------+
> |<-- cluster -->|
> +---------------+
> |<-- DSU -->|
> '---------------'
>
>system, I guess you would get (w/ Phantom SD and L2/L3 cache info in DT):
>
>CPU0 .. 3:
>
>MC SD_SHARE_PKG_RESOURCES
>DIE no SD_SHARE_PKG_RESOURCES
>
>CPU 4...7:
>
>DIE no SD_SHARE_PKG_RESOURCES
>
>I can't see how this would make any sense ...
>
>Reason is cpu_share_llc(). You don't check cache_level=0 and w/
>
>CPU0 .. 3:
>cache_topology[CPUX][0] == L2
>cache_topology[CPUX][1] == L3
>
>CPU4...7:
>cache_topology[CPUX][0] == L3
No, I didn't describe the cache topology of cpu[4,7] in the commit log,
cause that's not the point, I use 4 cores(3 level cache) as an example.
CPU4...7 if needed:
cache_topology[CPUX][0] == L2
cache_topology[CPUX][1] == L3
cache_topology[CPUX][level], level is strictly corresponding to the cache level
>
>there is, except for CPU0-1 and CPU2-3, no LLC match.
All 4 CPUs match the LLC in this case.
Thanks,
Wang
>
>[...]
Powered by blists - more mailing lists