lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220713133344.1201247-1-sudeep.holla@arm.com>
Date:   Wed, 13 Jul 2022 14:33:44 +0100
From:   Sudeep Holla <sudeep.holla@....com>
To:     linux-kernel@...r.kernel.org, conor.dooley@...rochip.com
Cc:     Sudeep Holla <sudeep.holla@....com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ionela Voinescu <ionela.voinescu@....com>,
        Pierre Gondois <pierre.gondois@....com>,
        linux-arm-kernel@...ts.infradead.org,
        linux-riscv@...ts.infradead.org
Subject: [PATCH -next] arch_topology: Fix cache attributes detection in the CPU hotplug path

init_cpu_topology() is called only once at the boot and all the cache
attributes are detected early for all the possible CPUs. However when
the CPUs are hotplugged out, the cacheinfo gets removed. While the
attributes are added back when the CPUs are hotplugged back in as part
of CPU hotplug state machine, it ends up called quite late after the
update_siblings_masks() are called in the secondary_start_kernel()
resulting in wrong llc_sibling_masks.

Move the call to detect_cache_attributes() inside update_siblings_masks()
to ensure the cacheinfo is updated before the LLC sibling masks are
updated. This will fix the incorrect LLC sibling masks generated when
the CPUs are hotplugged out and hotplugged back in again.

Reported-by: Ionela Voinescu <ionela.voinescu@....com>
Signed-off-by: Sudeep Holla <sudeep.holla@....com>
---
 drivers/base/arch_topology.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

Hi Conor,

Ionela reported an issue with the CPU hotplug and as a fix I need to
move the call to detect_cache_attributes() which I had thought to keep
it there from first but for no reason had moved it to init_cpu_topology().

Wonder if this fixes the -ENOMEM on RISC-V as this one is called on the
cpu in the secondary CPUs init path while init_cpu_topology executed
detect_cache_attributes() for all possible CPUs much earlier. I think
this might help as the percpu memory might be initialised in this case.

Anyways give this a try, also test the CPU hotplug and check if nothing
is broken on RISC-V. We noticed this bug only on one platform while

Regards,
Sudeep

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 441e14ac33a4..0424b59b695e 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -732,7 +732,11 @@ const struct cpumask *cpu_clustergroup_mask(int cpu)
 void update_siblings_masks(unsigned int cpuid)
 {
 	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
-	int cpu;
+	int cpu, ret;
+
+	ret = detect_cache_attributes(cpuid);
+	if (ret)
+		pr_info("Early cacheinfo failed, ret = %d\n", ret);
 	/* update core and thread sibling masks */
 	for_each_online_cpu(cpu) {
@@ -821,7 +825,7 @@ __weak int __init parse_acpi_topology(void)
 #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
 void __init init_cpu_topology(void)
 {
-	int ret, cpu;
+	int ret;
 	reset_cpu_topology();
 	ret = parse_acpi_topology();
@@ -836,13 +840,5 @@ void __init init_cpu_topology(void)
 		reset_cpu_topology();
 		return;
 	}
-
-	for_each_possible_cpu(cpu) {
-		ret = detect_cache_attributes(cpu);
-		if (ret) {
-			pr_info("Early cacheinfo failed, ret = %d\n", ret);
-			break;
-		}
-	}
 }
 #endif
--2.37.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ