Message-Id: <20241128002247.26726-3-ricardo.neri-calderon@linux.intel.com>
Date: Wed, 27 Nov 2024 16:22:47 -0800
From: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
To: x86@...nel.org
Cc: Andreas Herrmann <aherrmann@...e.com>,
Catalin Marinas <catalin.marinas@....com>,
Chen Yu <yu.c.chen@...el.com>,
Len Brown <len.brown@...el.com>,
Radu Rendec <rrendec@...hat.com>,
Pierre Gondois <Pierre.Gondois@....com>,
Pu Wen <puwen@...on.cn>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
Sudeep Holla <sudeep.holla@....com>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Will Deacon <will@...nel.org>,
Zhang Rui <rui.zhang@...el.com>,
Nikolay Borisov <nik.borisov@...e.com>,
Huang Ying <ying.huang@...el.com>,
Ricardo Neri <ricardo.neri@...el.com>,
linux-kernel@...r.kernel.org
Subject: [PATCH v8 2/2] x86/cacheinfo: Delete global num_cache_leaves
Linux remembers cpu_cacheinfo::num_leaves per CPU, but x86 initializes all
CPUs from the same global "num_cache_leaves".
This is erroneous on systems such as Meteor Lake, where each CPU has a
distinct num_leaves value. Delete the global "num_cache_leaves" and
initialize num_leaves on each CPU.
init_cache_level() no longer needs to set num_leaves. Also, it never had to
set num_levels as it is unnecessary in x86. Keep checking for zero cache
leaves. Such a condition indicates a bug.
Reviewed-by: Andreas Herrmann <aherrmann@...e.de>
Reviewed-by: Len Brown <len.brown@...el.com>
Reviewed-by: Nikolay Borisov <nik.borisov@...e.com>
Tested-by: Andreas Herrmann <aherrmann@...e.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
---
Cc: Andreas Herrmann <aherrmann@...e.com>
Cc: Catalin Marinas <catalin.marinas@....com>
Cc: Chen Yu <yu.c.chen@...el.com>
Cc: Huang Ying <ying.huang@...el.com>
Cc: Len Brown <len.brown@...el.com>
Cc: Nikolay Borisov <nik.borisov@...e.com>
Cc: Radu Rendec <rrendec@...hat.com>
Cc: Pierre Gondois <Pierre.Gondois@....com>
Cc: Pu Wen <puwen@...on.cn>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Cc: Sudeep Holla <sudeep.holla@....com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
Cc: Will Deacon <will@...nel.org>
Cc: Zhang Rui <rui.zhang@...el.com>
Cc: linux-arm-kernel@...ts.infradead.org
Cc: stable@...r.kernel.org # 6.3+
---
After this change, all CPUs will traverse CPUID leaf 0x4 when booted for
the first time. On systems with symmetric cache topologies this is
useless work.
Creating a list of processor models that have asymmetric cache topologies
was considered. The burden of maintaining such a list would outweigh the
performance benefit of skipping this extra step.
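
For illustration only, here is a minimal user-space sketch of the per-CPU
initialization described above. It is not the kernel code: struct sim_cpu,
sim_find_num_cache_leaves() and sim_init_cacheinfo() are hypothetical names,
and the EAX values of CPUID leaf 0x4 are simulated rather than read from
hardware. It assumes the usual leaf 0x4 convention that EAX[4:0] holds the
cache type and that type 0 ends the enumeration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one CPU: simulated CPUID.4 EAX per subleaf. */
struct sim_cpu {
	const uint32_t *leaf4_eax;	/* simulated EAX value of each subleaf */
	size_t nr_subleaves;		/* number of entries in leaf4_eax */
	unsigned int num_leaves;	/* per-CPU count; 0 = not initialized */
};

/* EAX[4:0] is the cache type; 0 means there are no more caches. */
#define CACHE_TYPE_MASK	0x1fu

/* Count subleaves until the first one that reports cache type 0. */
static unsigned int sim_find_num_cache_leaves(const struct sim_cpu *cpu)
{
	unsigned int i = 0;

	assert(cpu != NULL);
	while (i < cpu->nr_subleaves &&
	       (cpu->leaf4_eax[i] & CACHE_TYPE_MASK) != 0)
		i++;

	return i;
}

/*
 * Each CPU fills in its own count. A non-zero value means the count
 * has already been initialized, so the traversal is skipped.
 */
static void sim_init_cacheinfo(struct sim_cpu *cpu)
{
	if (!cpu->num_leaves)
		cpu->num_leaves = sim_find_num_cache_leaves(cpu);
}
```

With two simulated CPUs that enumerate a different number of subleaves, each
one ends up with its own num_leaves instead of inheriting the value computed
on the boot CPU.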
---
Changes since v7:
* Removed an ugly linebreak. (Boris)
* Folded patch 3/3 into 2/3 as both patches deal with init_cache_level().
(Boris)
* Removed the [set,get]_num_cache_leaves() wrappers. Instead, use the
existing get_cpu_cacheinfo(). (Boris)
* Future-proof init_cache_level() for cases in which cpu_cacheinfo::
num_leaves is still zero after cache info initialization.
Changes since v6:
* None
Changes since v5:
* Reordered the arguments of set_num_cache_leaves() for readability.
(Nikolay)
* Added Reviewed-by tag from Nikolay and Andreas. Thanks!
* Added Tested-by tag from Andreas. Thanks!
Changes since v4:
* None
Changes since v3:
* Rebased on v6.7-rc5.
Changes since v2:
* None
Changes since v1:
* Do not make num_cache_leaves a per-CPU variable. Instead, reuse the
existing per-CPU ci_cpu_cacheinfo variable. (Dave Hansen)
---
arch/x86/kernel/cpu/cacheinfo.c | 41 +++++++++++++++------------------
1 file changed, 18 insertions(+), 23 deletions(-)
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 392d09c936d6..95e38ab98a72 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -178,8 +178,6 @@ struct _cpuid4_info_regs {
struct amd_northbridge *nb;
};
-static unsigned short num_cache_leaves;
-
/* AMD doesn't have CPUID4. Emulate it here to report the same
information to the user. This makes some assumptions about the machine:
L2 not shared, no SMT etc. that is currently true on AMD CPUs.
@@ -718,19 +716,21 @@ void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c)
void init_amd_cacheinfo(struct cpuinfo_x86 *c)
{
+ unsigned int cpu = c->cpu_index;
+
if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
- num_cache_leaves = find_num_cache_leaves(c);
+ get_cpu_cacheinfo(cpu)->num_leaves = find_num_cache_leaves(c);
} else if (c->extended_cpuid_level >= 0x80000006) {
if (cpuid_edx(0x80000006) & 0xf000)
- num_cache_leaves = 4;
+ get_cpu_cacheinfo(cpu)->num_leaves = 4;
else
- num_cache_leaves = 3;
+ get_cpu_cacheinfo(cpu)->num_leaves = 3;
}
}
void init_hygon_cacheinfo(struct cpuinfo_x86 *c)
{
- num_cache_leaves = find_num_cache_leaves(c);
+ get_cpu_cacheinfo(c->cpu_index)->num_leaves = find_num_cache_leaves(c);
}
void init_intel_cacheinfo(struct cpuinfo_x86 *c)
@@ -742,19 +742,18 @@ void init_intel_cacheinfo(struct cpuinfo_x86 *c)
unsigned int l2_id = 0, l3_id = 0, num_threads_sharing, index_msb;
if (c->cpuid_level > 3) {
- static int is_initialized;
-
- if (is_initialized == 0) {
- /* Init num_cache_leaves from boot CPU */
- num_cache_leaves = find_num_cache_leaves(c);
- is_initialized++;
- }
+ /*
+ * There should be at least one leaf. A non-zero value means
+ * that the number of leaves has been initialized.
+ */
+ if (!get_cpu_cacheinfo(c->cpu_index)->num_leaves)
+ get_cpu_cacheinfo(c->cpu_index)->num_leaves = find_num_cache_leaves(c);
/*
* Whenever possible use cpuid(4), deterministic cache
* parameters cpuid leaf to find the cache details
*/
- for (i = 0; i < num_cache_leaves; i++) {
+ for (i = 0; i < get_cpu_cacheinfo(c->cpu_index)->num_leaves; i++) {
struct _cpuid4_info_regs this_leaf = {};
int retval;
@@ -790,14 +789,14 @@ void init_intel_cacheinfo(struct cpuinfo_x86 *c)
* Don't use cpuid2 if cpuid4 is supported. For P4, we use cpuid2 for
* trace cache
*/
- if ((num_cache_leaves == 0 || c->x86 == 15) && c->cpuid_level > 1) {
+ if ((!get_cpu_cacheinfo(c->cpu_index)->num_leaves || c->x86 == 15) && c->cpuid_level > 1) {
/* supports eax=2 call */
int j, n;
unsigned int regs[4];
unsigned char *dp = (unsigned char *)regs;
int only_trace = 0;
- if (num_cache_leaves != 0 && c->x86 == 15)
+ if (get_cpu_cacheinfo(c->cpu_index)->num_leaves && c->x86 == 15)
only_trace = 1;
/* Number of times to iterate */
@@ -991,14 +990,10 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
int init_cache_level(unsigned int cpu)
{
- struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
-
- if (!num_cache_leaves)
+ /* There should be at least one leaf. */
+ if (!get_cpu_cacheinfo(cpu)->num_leaves)
return -ENOENT;
- if (!this_cpu_ci)
- return -EINVAL;
- this_cpu_ci->num_levels = 3;
- this_cpu_ci->num_leaves = num_cache_leaves;
+
return 0;
}
--
2.34.1