Message-Id: <20240221092101.90740-3-haifeng.xu@shopee.com>
Date: Wed, 21 Feb 2024 09:21:01 +0000
From: Haifeng Xu <haifeng.xu@...pee.com>
To: reinette.chatre@...el.com
Cc: fenghua.yu@...el.com,
	babu.moger@....com,
	peternewman@...gle.com,
	x86@...nel.org,
	linux-kernel@...r.kernel.org,
	Haifeng Xu <haifeng.xu@...pee.com>
Subject: [PATCH v2 2/2] x86/resctrl: Add tracepoint for llc_occupancy tracking

In our production environment, after monitor groups are removed, their
unused RMIDs get stuck in the limbo list forever because their
llc_occupancy is always larger than the threshold. These RMIDs can be
freed successfully by raising the threshold.

To determine how high the threshold should be set, the llc_occupancy of
RMIDs in each rdt domain can be acquired with the following steps:

1) perf probe -a '__rmid_read eventid rmid'
   perf probe -a '__rmid_read%return $retval'
2) perf record -e probe:__rmid_read -e probe:__rmid_read__return -aR sleep 10
3) perf script > __rmid_read.txt
4) cat __rmid_read.txt | grep "eventid=0x1 " -A 1 | grep "kworker" > llc_occupancy.txt

Using the perf tool to track llc_occupancy requires filtering the log
manually; a tracepoint is more convenient for users. So add a new
tracepoint that shows the llc_occupancy of busy RMIDs when scanning the
limbo list.

Signed-off-by: Haifeng Xu <haifeng.xu@...pee.com>
Suggested-by: Reinette Chatre <reinette.chatre@...el.com>
---
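Note (not part of the commit message): a minimal sketch of how the new
tracepoint output could be reduced to one number per domain when choosing
a threshold. It relies only on the TP_printk format added by this patch
("mon_hw_id=%u domain=%d llc_occupancy=%llu"). The helper name
summarize_limbo is hypothetical, and the tracefs paths in the comments
are assumptions (tracefs mounted at /sys/kernel/tracing and the event
registered under the "resctrl" trace system); adjust them to the running
system.

```shell
# Hypothetical helper: read trace text on stdin and print the largest
# llc_occupancy reported for each domain by mon_llc_occupancy_limbo.
summarize_limbo() {
    awk '
        match($0, /domain=[0-9]+/) {
            # "domain=" is 7 characters long
            d = substr($0, RSTART + 7, RLENGTH - 7)
            if (match($0, /llc_occupancy=[0-9]+/)) {
                # "llc_occupancy=" is 14 characters long
                v = substr($0, RSTART + 14, RLENGTH - 14)
                # compare numerically, keep the original digits for output
                if (!(d in max) || (v + 0) > (max[d] + 0))
                    max[d] = v
            }
        }
        END {
            for (d in max)
                printf "domain=%s max_llc_occupancy=%s\n", d, max[d]
        }
    ' | sort -t= -k2n
}

# Possible usage (assumed paths, run as root):
#   echo 1 > /sys/kernel/tracing/events/resctrl/mon_llc_occupancy_limbo/enable
#   sleep 10
#   summarize_limbo < /sys/kernel/tracing/trace
```

A threshold above the per-domain maximum reported here should allow the
corresponding busy RMIDs to leave the limbo list on a later scan.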
 arch/x86/kernel/cpu/resctrl/monitor.c |  2 ++
 arch/x86/kernel/cpu/resctrl/trace.h   | 13 +++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index f136ac046851..1533b1932b49 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -23,6 +23,7 @@
 #include <asm/resctrl.h>
 
 #include "internal.h"
+#include "trace.h"
 
 struct rmid_entry {
 	u32				rmid;
@@ -302,6 +303,7 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 			}
 		}
 		crmid = nrmid + 1;
+		trace_mon_llc_occupancy_limbo(nrmid, d->id, val);
 	}
 }
 
diff --git a/arch/x86/kernel/cpu/resctrl/trace.h b/arch/x86/kernel/cpu/resctrl/trace.h
index 495fb90c8572..4bf95b7b4db8 100644
--- a/arch/x86/kernel/cpu/resctrl/trace.h
+++ b/arch/x86/kernel/cpu/resctrl/trace.h
@@ -35,6 +35,19 @@ TRACE_EVENT(pseudo_lock_l3,
 	    TP_printk("hits=%llu miss=%llu",
 		      __entry->l3_hits, __entry->l3_miss));
 
+TRACE_EVENT(mon_llc_occupancy_limbo,
+	    TP_PROTO(u32 mon_hw_id, int id, u64 occupancy),
+	    TP_ARGS(mon_hw_id, id, occupancy),
+	    TP_STRUCT__entry(__field(u32, mon_hw_id)
+			     __field(int, id)
+			     __field(u64, occupancy)),
+	    TP_fast_assign(__entry->mon_hw_id = mon_hw_id;
+			   __entry->id = id;
+			   __entry->occupancy = occupancy;),
+	    TP_printk("mon_hw_id=%u domain=%d llc_occupancy=%llu",
+		      __entry->mon_hw_id, __entry->id, __entry->occupancy)
+	   );
+
 #endif /* _TRACE_RESCTRL_H */
 
 #undef TRACE_INCLUDE_PATH
-- 
2.25.1

