Message-ID: <20231201005720.235639-1-babu.moger@amd.com>
Date: Thu, 30 Nov 2023 18:57:05 -0600
From: Babu Moger <babu.moger@....com>
To: <corbet@....net>, <fenghua.yu@...el.com>,
<reinette.chatre@...el.com>, <tglx@...utronix.de>,
<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
<rdunlap@...radead.org>, <tj@...nel.org>, <peterz@...radead.org>,
<seanjc@...gle.com>, <kim.phillips@....com>, <babu.moger@....com>,
<jmattson@...gle.com>, <ilpo.jarvinen@...ux.intel.com>,
<jithu.joseph@...el.com>, <kan.liang@...ux.intel.com>,
<nikunj@....com>, <daniel.sneddon@...ux.intel.com>,
<pbonzini@...hat.com>, <rick.p.edgecombe@...el.com>,
<rppt@...nel.org>, <maciej.wieczor-retman@...el.com>,
<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<eranian@...gle.com>, <peternewman@...gle.com>, <dhagiani@....com>
Subject: [PATCH 00/15] x86/resctrl : Support AMD QoS RMID Pinning feature
This series adds support for the AMD QoS RMID Pinning feature, also called
ABMC (Assignable Bandwidth Monitoring Counters).
The feature details are available in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC). The documentation is available at
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
The patches are based on top of commit
346887b65d89ae987698bc1efd8e5536bd180b3f (tip/master)
# Introduction
AMD hardware can support 256 or more RMIDs. However, the bandwidth monitoring
feature only guarantees that RMIDs currently assigned to a processor will
be tracked by hardware. The counters of any other RMIDs which are no
longer being tracked will be reset to zero. The MBM event counters return
"Unavailable" for the RMIDs that are not active.
Users can create 256 or more monitor groups, but only a limited number of
groups can be guaranteed valid monitoring numbers. With an ever-changing
system configuration, there is no way to know definitively which of these
groups will be active at a given point in time. Users do not have the
option to monitor a group or set of groups for a certain period of time
without worrying about the RMID being reset in between.
The ABMC feature provides an option to pin (or assign) the RMID to the
hardware counter and monitor the bandwidth for a longer duration. The
pinned RMID will be active until the user unpins (or unassigns) it. There
is no need to worry about counters being reset during this period.
Additionally, the user can specify a bitmask identifying the specific
bandwidth types from the given source to track with the counter.
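To illustrate how such a bitmask can be read, here is a minimal sketch that
decodes a configuration value into the bandwidth types it selects. The bit
assignments are an assumption taken from the mbm_total_bytes_config table in
the kernel's resctrl documentation, not from this series, and the function
name decode_mbm_config is hypothetical.

```shell
#!/bin/sh
# Decode an MBM event configuration bitmask (hypothetical helper).
# Bit meanings follow the kernel resctrl documentation (assumed unchanged).
decode_mbm_config() {
    mask=$(( $1 ))              # accepts hex input such as 0x33
    bit=0
    while [ "$bit" -le 6 ]; do
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            case "$bit" in
                0) echo "local reads" ;;
                1) echo "remote reads" ;;
                2) echo "local non-temporal writes" ;;
                3) echo "remote non-temporal writes" ;;
                4) echo "local slow-memory reads" ;;
                5) echo "remote slow-memory reads" ;;
                6) echo "dirty victims" ;;
            esac
        fi
        bit=$(( bit + 1 ))
    done
}

# 0x33 is the read-only mask used in the examples further down.
decode_mbm_config 0x33
```

Under these assumptions, 0x33 selects the four read types, 0x15 selects the
three local types, and 0x7f selects all seven.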
# Linux Implementation
The hardware provides a total of 32 counters for assignment.
Each Linux resctrl group can be assigned a maximum of 2 counters: one for
mbm_total_bytes and one for mbm_local_bytes. Users also have the option to
assign only one counter to the group. If the system runs out of assignable
counters, the kernel reports an error when the user attempts a new
counter assignment. Users must unassign counters already in use to free
them for new assignments.
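The counter budget can be tracked from user space. The sketch below is only
an illustration: it assumes one counter is consumed per event in the
"assign" state, uses the pool size of 32 stated above, and parses
monitor_state strings in the format shown in the examples that follow. The
helper name count_assigned is hypothetical.

```shell
#!/bin/sh
# Count how many counters a set of monitor_state strings consumes.
# Assumption: each event in the "assign" state uses exactly one counter.
TOTAL_COUNTERS=32

count_assigned() {
    # Reads one monitor_state string per line on stdin,
    # e.g. "total=assign;local=unassign".
    used=0
    while IFS= read -r state; do
        case "$state" in *total=assign*) used=$(( used + 1 )) ;; esac
        case "$state" in *local=assign*) used=$(( used + 1 )) ;; esac
    done
    echo "$used"
}

# Two groups: one with both events assigned, one with only total assigned.
used=$(printf '%s\n' "total=assign;local=assign" \
                     "total=assign;local=unassign" | count_assigned)
echo "used=$used free=$(( TOTAL_COUNTERS - used ))"
```

With both counters assigned to every group, 32 counters cover 16 groups.
Assigning only one event per group doubles that to 32.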
# Examples
a. Check if ABMC support is available
#mount -t resctrl resctrl /sys/fs/resctrl/
#cat /sys/fs/resctrl/info/L3_MON/mon_features
llc_occupancy
mbm_total_bytes
mbm_total_bytes_config
mbm_local_bytes
mbm_local_bytes_config
abmc_capable ← Linux kernel detected ABMC feature.
b. Mount with ABMC support
#umount /sys/fs/resctrl/
#mount -o abmc -t resctrl resctrl /sys/fs/resctrl/
c. Read the monitor states. There will be a new file "monitor_state"
for each monitor group when the ABMC feature is enabled. By default,
both total and local MBM events are in the "unassign" state.
#cat /sys/fs/resctrl/monitor_state
total=unassign;local=unassign
d. Read the events mbm_total_bytes and mbm_local_bytes. Note that MBM
events are not available until the user assigns them explicitly.
Users need to assign counters to monitor the events in this mode.
#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
Unavailable
#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
Unavailable
e. Assign a h/w counter to the total event and read the monitor_state.
#echo total=assign > /sys/fs/resctrl/monitor_state
#cat /sys/fs/resctrl/monitor_state
total=assign;local=unassign
f. Now that the total event is assigned, read the total event.
#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
6136000
g. Assign a h/w counter to both total and local events and read the monitor_state.
#echo "total=assign;local=assign" > /sys/fs/resctrl/monitor_state
#cat /sys/fs/resctrl/monitor_state
total=assign;local=assign
h. Now that both total and local events are assigned, read the events.
#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
6136000
#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
58694
i. Check the bandwidth configuration for the group. Note that the bandwidth
configuration has domain scope. The total event defaults to 0x7F (to
count all the events) and the local event defaults to 0x15
(to count all the local NUMA events). The event bitmap decoding is
available in https://www.kernel.org/doc/Documentation/x86/resctrl.rst
in sections "mbm_total_bytes_config" and "mbm_local_bytes_config":
#cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
0=0x7f;1=0x7f
#cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
0=0x15;1=0x15
j. Change the bandwidth source for domain 0 for the total event to count only reads.
Note that this change affects all events on domain 0.
#echo "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
#cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
0=0x33;1=0x7f
k. Now read the total event again. The mbm_total_bytes should display
only the read events.
#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
6136000
l. Unmount the resctrl
#umount /sys/fs/resctrl/
NOTE: For simplicity, these examples are run on the default resctrl group.
Similar experiments can be run on non-default groups.
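
Since the event files report byte counts, a bandwidth figure can be derived
from two reads of an assigned event taken some seconds apart. The following
is a rough sketch under that assumption. The helper name mbm_rate is
hypothetical, and the second sample value is made up for illustration. It
ignores counter wraparound and treats any non-numeric reading, such as
"Unavailable", as no data.

```shell
#!/bin/sh
# Convert two cumulative byte counts into an average rate in bytes/second.
# Prints "Unavailable" and returns 1 when either sample is not a number.
mbm_rate() {
    first=$1
    second=$2
    seconds=$3                  # sampling interval, assumed > 0
    for sample in "$first" "$second"; do
        case "$sample" in
            ''|*[!0-9]*) echo "Unavailable"; return 1 ;;
        esac
    done
    echo $(( ($second - $first) / $seconds ))
}

# On a live system the samples would come from two reads of, for example,
# /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes with a sleep in between.
mbm_rate 6136000 9136000 10
```

An unassigned event yields "Unavailable" from this helper rather than a
misleading zero rate.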
---
Babu Moger (15):
x86/resctrl: Remove hard-coded memory bandwidth limit
x86/resctrl: Remove hard-coded memory bandwidth event configuration
x86/resctrl: Add support for Assignable Bandwidth Monitoring Counters
(ABMC)
x86/resctrl: Add ABMC feature in the command line options
x86/resctrl: Detect ABMC feature details
x86/resctrl: Add the mount option for ABMC feature
x86/resctrl: Add support to enable/disable ABMC feature
x86/resctrl: Introduce interface to display number of ABMC counters
x86/resctrl: Add interface to display monitor state of the group
x86/resctrl: Initialize ABMC counters bitmap
x86/resctrl: Add data structures for ABMC assignment
x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg
x86/resctrl: Add the interface to assign a ABMC counter
x86/resctrl: Add interface unassign a ABMC counter
x86/resctrl: Update ABMC assignment on event configuration changes
.../admin-guide/kernel-parameters.txt | 2 +-
Documentation/arch/x86/resctrl.rst | 52 +++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 2 +
arch/x86/kernel/cpu/cpuid-deps.c | 2 +
arch/x86/kernel/cpu/resctrl/core.c | 23 +-
arch/x86/kernel/cpu/resctrl/internal.h | 49 ++-
arch/x86/kernel/cpu/resctrl/monitor.c | 22 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 415 +++++++++++++++++-
arch/x86/kernel/cpu/scattered.c | 1 +
include/linux/resctrl.h | 2 +
11 files changed, 562 insertions(+), 9 deletions(-)
--
2.34.1