Message-ID: <20240719092545.50441-3-Dhananjay.Ugwekar@amd.com>
Date: Fri, 19 Jul 2024 09:25:46 +0000
From: Dhananjay Ugwekar <Dhananjay.Ugwekar@....com>
To: <peterz@...radead.org>, <mingo@...hat.com>, <acme@...nel.org>,
	<namhyung@...nel.org>, <mark.rutland@....com>,
	<alexander.shishkin@...ux.intel.com>, <jolsa@...nel.org>,
	<irogers@...gle.com>, <adrian.hunter@...el.com>, <kan.liang@...ux.intel.com>,
	<tglx@...utronix.de>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>,
	<x86@...nel.org>, <rui.zhang@...el.com>
CC: <linux-perf-users@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<linux-pm@...r.kernel.org>, <ananth.narayan@....com>,
	<gautham.shenoy@....com>, <kprateek.nayak@....com>, <ravi.bangoria@....com>,
	<sandipan.das@....com>, <Dhananjay.Ugwekar@....com>, Michael Larabel
	<michael@...haellarabel.com>
Subject: [PATCH 2/2] powercap/intel_rapl: Fix the energy-pkg event for AMD CPUs

After commit 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD
0x80000026 leaf"), on AMD processors that support extended CPUID leaf
0x80000026, the topology_logical_die_id() macro no longer returns the
package id; it returns the CCD (Core Complex Die) id instead. As a result,
the scope of the energy-pkg event changed from package to CCD.

For more historical context, please refer to commit 32fb480e0a2c
("powercap/intel_rapl: Support multi-die/package"), which initially changed
the RAPL scope from package to die for all systems, because Intel systems
with die enumeration have die-scoped RAPL and systems without die
enumeration were not affected. So all systems (Intel, AMD, Hygon) worked
correctly with topology_logical_die_id() until the "0x80000026 leaf"
commit mentioned above.

Replace topology_logical_die_id() with topology_physical_package_id()
for AMD and Hygon only, which fixes the energy-pkg event.
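
As a rough userspace illustration of the selection described above (not
the kernel code itself), the vendor check can be modelled as below. The
enum, the struct and the function names are hypothetical stand-ins for
boot_cpu_data.x86_vendor and the topology_*() helpers.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's X86_VENDOR_* constants. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_HYGON };

/* Hypothetical per-CPU topology record; the kernel derives these from CPUID. */
struct cpu_topo {
	int package_id;     /* physical package (socket) id */
	int logical_die_id; /* die id; the CCD id on AMD with leaf 0x80000026 */
};

/* Models the intent of rapl_pmu_is_pkg_scope(): AMD and Hygon are per-package. */
static bool is_pkg_scope(enum cpu_vendor vendor)
{
	return vendor == VENDOR_AMD || vendor == VENDOR_HYGON;
}

/* Pick the id that keys the RAPL package domain for one CPU. */
static int rapl_domain_uid(enum cpu_vendor vendor, const struct cpu_topo *topo)
{
	return is_pkg_scope(vendor) ? topo->package_id : topo->logical_die_id;
}
```

With a CPU in package 1 whose die id is 11, this sketch yields 1 for AMD
and Hygon and 11 for Intel, so all CCDs of one socket share a domain.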

On an AMD 2 socket 8 CCD Zen5 server:

Before:

linux$ ls /sys/class/powercap/
intel-rapl      intel-rapl:1:0  intel-rapl:3:0  intel-rapl:5:0
intel-rapl:7:0  intel-rapl:9:0  intel-rapl:b:0  intel-rapl:d:0
intel-rapl:f:0  intel-rapl:0    intel-rapl:2    intel-rapl:4
intel-rapl:6    intel-rapl:8    intel-rapl:a    intel-rapl:c
intel-rapl:e    intel-rapl:0:0  intel-rapl:2:0  intel-rapl:4:0
intel-rapl:6:0  intel-rapl:8:0  intel-rapl:a:0  intel-rapl:c:0
intel-rapl:e:0  intel-rapl:1    intel-rapl:3    intel-rapl:5
intel-rapl:7    intel-rapl:9    intel-rapl:b    intel-rapl:d
intel-rapl:f

After:

linux$ ls /sys/class/powercap/
intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-rapl:1:0

After this change, only one sysfs entry is created per event per package.
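
The entry counts in the two listings can be checked with a small sketch.
It assumes, based only on the listings above, that the directory holds
the control type entry ("intel-rapl"), one zone per RAPL domain and one
sub-zone per domain, and that "8 CCD" means 8 CCDs per socket (the
"Before" listing shows domains 0 through f). The helper name is
hypothetical.

```c
/*
 * Entries expected under /sys/class/powercap/: one control type entry,
 * one zone per RAPL domain, and subzones_per_domain sub-zones under
 * each zone. Assumption: this layout is read off the listings above.
 */
static int powercap_entries(int domains, int subzones_per_domain)
{
	return 1 + domains * (1 + subzones_per_domain);
}
```

Under these assumptions, 16 CCD-scoped domains give 33 entries (the
"Before" listing) and 2 package-scoped domains give 5 (the "After"
listing).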

Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
Reported-by: Michael Larabel <michael@...haellarabel.com>
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@....com>
---
 drivers/powercap/intel_rapl_common.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 3cffa6c79538..2f24ca764408 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -2128,6 +2128,18 @@ void rapl_remove_package(struct rapl_package *rp)
 }
 EXPORT_SYMBOL_GPL(rapl_remove_package);
 
+/*
+ * Intel systems that enumerate the DIE domain implement RAPL domains
+ * per die. The same is not true for AMD and Hygon processors, where
+ * RAPL domains for PKG energy are in fact per package. Since
+ * logical_die_id is the same as logical_package_id in the absence of DIE
+ * enumeration, use topology_logical_die_id() on Intel systems and
+ * topology_physical_package_id() on AMD and Hygon systems.
+ */
+#define rapl_pmu_is_pkg_scope()				\
+	(boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||	\
+	 boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
+
 /* caller to ensure CPU hotplug lock is held */
 struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv,
 							 bool id_is_cpu)
@@ -2136,7 +2148,8 @@ struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
 	int uid;
 
 	if (id_is_cpu)
-		uid = topology_logical_die_id(id);
+		uid = rapl_pmu_is_pkg_scope() ?
+		      topology_physical_package_id(id) : topology_logical_die_id(id);
 	else
 		uid = id;
 
@@ -2168,9 +2181,10 @@ struct rapl_package *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
 		return ERR_PTR(-ENOMEM);
 
 	if (id_is_cpu) {
-		rp->id = topology_logical_die_id(id);
+		rp->id = rapl_pmu_is_pkg_scope() ?
+			 topology_physical_package_id(id) : topology_logical_die_id(id);
 		rp->lead_cpu = id;
-		if (topology_max_dies_per_package() > 1)
+		if (!rapl_pmu_is_pkg_scope() && topology_max_dies_per_package() > 1)
 			snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
 				 topology_physical_package_id(id), topology_die_id(id));
 		else
-- 
2.34.1

