Message-ID: <ee19ea57-96e7-43d0-ab27-3dd12fb549bc@intel.com>
Date: Fri, 11 Apr 2025 15:04:27 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <tony.luck@...el.com>,
<peternewman@...gle.com>
CC: <corbet@....net>, <tglx@...utronix.de>, <mingo@...hat.com>,
<bp@...en8.de>, <dave.hansen@...ux.intel.com>, <x86@...nel.org>,
<hpa@...or.com>, <paulmck@...nel.org>, <akpm@...ux-foundation.org>,
<thuth@...hat.com>, <rostedt@...dmis.org>, <ardb@...nel.org>,
<gregkh@...uxfoundation.org>, <daniel.sneddon@...ux.intel.com>,
<jpoimboe@...nel.org>, <alexandre.chartre@...cle.com>,
<pawan.kumar.gupta@...ux.intel.com>, <thomas.lendacky@....com>,
<perry.yuan@....com>, <seanjc@...gle.com>, <kai.huang@...el.com>,
<xiaoyao.li@...el.com>, <kan.liang@...ux.intel.com>, <xin3.li@...el.com>,
<ebiggers@...gle.com>, <xin@...or.com>, <sohil.mehta@...el.com>,
<andrew.cooper3@...rix.com>, <mario.limonciello@....com>,
<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<maciej.wieczor-retman@...el.com>, <eranian@...gle.com>
Subject: Re: [PATCH v12 19/26] x86/resctrl: Add event configuration directory
under info/L3_MON/
Hi Babu,
On 4/3/25 5:18 PM, Babu Moger wrote:
> Create the configuration directory and files for mbm_cntr_assign mode.
> These configurations will be used to assign MBM events in mbm_cntr_assign
> mode, with two default configurations created upon mounting.
>
> Example:
> $ cd /sys/fs/resctrl/
> $ cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
> local_reads, remote_reads, local_non_temporal_writes,
> remote_non_temporal_writes, local_reads_slow_memory,
> remote_reads_slow_memory, dirty_victim_writes_all
>
> $ cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
> local_reads, local_non_temporal_writes, local_reads_slow_memory
>
> Signed-off-by: Babu Moger <babu.moger@....com>
> ---
> v12: New patch to hold the MBM event configurations for mbm_cntr_assign mode.
> ---
> Documentation/arch/x86/resctrl.rst | 29 ++++++++++
> arch/x86/kernel/cpu/resctrl/internal.h | 2 +
> arch/x86/kernel/cpu/resctrl/monitor.c | 1 +
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 77 ++++++++++++++++++++++++++
> 4 files changed, 109 insertions(+)
>
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index 71ed1cfed33a..99f9f4b9b501 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -306,6 +306,35 @@ with the following files:
> # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
> 0=30;1=30
>
> +"counter_configs:
(mismatched quotes)
This organization needs some extra thought ... consider that the section starts with
"If RDT monitoring is available there will be an "L3_MON" directory
with the following *files*:"
> + The directory for storing event configuration files, which will be used to
> + assign counters when the mbm_cntr_assign mode is enabled.
This needs a more imperative tone.
> +
> + Following types of events are supported:
> +
> + ==== ========================= ============================================================
> + Bits Name Description
> + ==== ========================= ============================================================
> + 6 dirty_victim_writes_all Dirty Victims from the QOS domain to all types of memory
> + 5 remote_reads_slow_memory Reads to slow memory in the non-local NUMA domain
> + 4 local_reads_slow_memory Reads to slow memory in the local NUMA domain
> + 3 remote_non_temporal_writes Non-temporal writes to non-local NUMA domain
> + 2 local_non_temporal_writes Non-temporal writes to local NUMA domain
> + 1 remote_reads Reads to memory in the non-local NUMA domain
> + 0 local_reads Reads to memory in the local NUMA domain
> + ==== ========================= ==========================================================
> +
> + Two default configurations, mbm_local_bytes and mbm_total_bytes, will be created
"will be created" -> "are created" ... or maybe just:
There are two default configurations: mbm_local_bytes and mbm_total_bytes.
> + upon mounting.
"upon mounting" seems unnecessary.
> + ::
> +
> + # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
> + local_reads, remote_reads, local_non_temporal_writes, remote_non_temporal_writes,
> + local_reads_slow_memory, remote_reads_slow_memory, dirty_victim_writes_all
> +
> + # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
> + local_reads, local_non_temporal_writes, local_reads_slow_memory
> +
> "max_threshold_occupancy":
> Read/write file provides the largest value (in
> bytes) at which a previously used LLC_occupancy
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index b7d1a59f09f8..a943450bf2c8 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -282,11 +282,13 @@ struct mbm_cntr_cfg {
> #define RFTYPE_RES_CACHE BIT(8)
> #define RFTYPE_RES_MB BIT(9)
> #define RFTYPE_DEBUG BIT(10)
> +#define RFTYPE_CONFIG BIT(11)
hmmm ... these flags are becoming quite complex. Even so, RFTYPE_CONFIG would be
unique to this new feature so I think a more specific name would be appropriate.
Maybe even "RFTYPE_MBM_EVENT_CONFIG".
> #define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
> #define RFTYPE_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
> #define RFTYPE_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP)
> #define RFTYPE_CTRL_BASE (RFTYPE_BASE | RFTYPE_CTRL)
> #define RFTYPE_MON_BASE (RFTYPE_BASE | RFTYPE_MON)
> +#define RFTYPE_MON_CONFIG (RFTYPE_CONFIG | RFTYPE_MON)
Why is this flag needed?
>
> /* List of all resource groups */
> extern struct list_head rdt_all_groups;
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 58476c065921..4525295b1725 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1264,6 +1264,7 @@ int __init resctrl_mon_resource_init(void)
> if (r->mon.mbm_cntr_assignable) {
> resctrl_file_fflags_init("num_mbm_cntrs", RFTYPE_MON_INFO);
> resctrl_file_fflags_init("available_mbm_cntrs", RFTYPE_MON_INFO);
> + resctrl_file_fflags_init("event_filter", RFTYPE_MON_CONFIG);
> }
>
> return 0;
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index aba23e2096db..b2122a1dd36c 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1907,6 +1907,25 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
> return ret ?: nbytes;
> }
>
> +static int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
> +{
> + struct mbm_assign_config *assign_config = of->kn->parent->priv;
> + bool sep = false;
> + int i;
> +
> + for (i = 0; i < NUM_MBM_EVT_VALUES; i++) {
> + if (assign_config->val & mbm_evt_values[i].evt_val) {
> + if (sep)
> + seq_puts(seq, ", ");
seq_putc()
> + seq_printf(seq, "%s", mbm_evt_values[i].evt_name);
> + sep = true;
> + }
> + }
> + seq_puts(seq, "\n");
seq_putc(seq, '\n')
> +
> + return 0;
> +}
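As a rough userspace illustration of the separator logic in event_filter_show() (not kernel code), the sketch below formats a bitmask into the same ", "-separated list. The evt_values[] table is a hypothetical stand-in for the series' mbm_evt_values[], filled in from the bit/name table in the documentation hunk above; format_event_filter() and the use of snprintf() in place of the seq_file helpers are assumptions made so the example can run outside the kernel.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the series' mbm_evt_values[] table. */
struct evt_value {
	unsigned int evt_val;
	const char *evt_name;
};

/* Bit positions and names taken from the documentation table in this patch. */
static const struct evt_value evt_values[] = {
	{ 1u << 0, "local_reads" },
	{ 1u << 1, "remote_reads" },
	{ 1u << 2, "local_non_temporal_writes" },
	{ 1u << 3, "remote_non_temporal_writes" },
	{ 1u << 4, "local_reads_slow_memory" },
	{ 1u << 5, "remote_reads_slow_memory" },
	{ 1u << 6, "dirty_victim_writes_all" },
};

#define NUM_EVT_VALUES (sizeof(evt_values) / sizeof(evt_values[0]))

/*
 * Write the names of all events set in @val into @buf, separated by ", "
 * and terminated by a newline. Returns the number of characters written
 * (excluding the NUL), or -1 if @buf is too small.
 */
static int format_event_filter(unsigned int val, char *buf, size_t len)
{
	size_t pos = 0;
	size_t i;
	int sep = 0;

	for (i = 0; i < NUM_EVT_VALUES; i++) {
		int n;

		if (!(val & evt_values[i].evt_val))
			continue;

		/* Emit the separator only before the second and later names. */
		n = snprintf(buf + pos, len - pos, "%s%s",
			     sep ? ", " : "", evt_values[i].evt_name);
		if (n < 0 || (size_t)n >= len - pos)
			return -1;
		pos += (size_t)n;
		sep = 1;
	}

	/* Room is needed for the trailing newline and the NUL terminator. */
	if (len - pos < 2)
		return -1;
	buf[pos++] = '\n';
	buf[pos] = '\0';

	return (int)pos;
}
```

Under these assumptions, a value of 0x15 (bits 0, 2 and 4) yields the mbm_local_bytes line from the changelog example and 0x7f yields the mbm_total_bytes line.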
> +
> /* rdtgroup information files for one cache resource. */
> static struct rftype res_common_files[] = {
> {
> @@ -2019,6 +2038,12 @@ static struct rftype res_common_files[] = {
> .seq_show = mbm_local_bytes_config_show,
> .write = mbm_local_bytes_config_write,
> },
> + {
> + .name = "event_filter",
> + .mode = 0444,
> + .kf_ops = &rdtgroup_kf_single_ops,
> + .seq_show = event_filter_show,
> + },
> {
> .name = "mbm_assign_mode",
> .mode = 0444,
> @@ -2314,6 +2339,52 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
> return ret;
> }
>
> +static int resctrl_mkdir_info_configs(void *priv, char *name, unsigned long fflags)
Why a void * instead of struct rdt_resource *?
Also please fix spacing.
Also, why do fflags need to be provided as a parameter? These are so custom I think the
hardcoding should be contained here instead of the caller. With this the function name
can also be made specific to what it does ... perhaps "resctrl_mkdir_counter_configs()"
(please feel free to improve).
> +{
> + struct kernfs_node *l3_mon_kn, *kn_subdir, *kn_subdir2;
> + int ret, i;
> +
> + l3_mon_kn = kernfs_find_and_get(kn_info, name);
> + if (!l3_mon_kn)
> + return -ENOENT;
> +
> + kn_subdir = kernfs_create_dir(l3_mon_kn, "counter_configs", l3_mon_kn->mode, priv);
> + if (IS_ERR(kn_subdir)) {
> + kernfs_put(l3_mon_kn);
> + return PTR_ERR(kn_subdir);
> + }
> +
> + ret = rdtgroup_kn_set_ugid(kn_subdir);
> + if (ret) {
> + kernfs_put(l3_mon_kn);
> + return ret;
> + }
> +
> + for (i = 0; i < NUM_MBM_ASSIGN_CONFIGS; i++) {
This can instead work through the resource's evt_list and use a flag (TBD how to
adapt "configurable") to determine if a directory should be created for it.
> + kn_subdir2 = kernfs_create_dir(kn_subdir, mbm_assign_configs[i].name,
> + kn_subdir->mode, &mbm_assign_configs[i]);
> + if (IS_ERR(kn_subdir)) {
IS_ERR(kn_subdir2)?
> + ret = PTR_ERR(kn_subdir2);
> + goto config_out;
> + }
> +
> + ret = rdtgroup_kn_set_ugid(kn_subdir2);
> + if (ret)
> + goto config_out;
> +
> + ret = rdtgroup_add_files(kn_subdir2, fflags);
> + if (!ret)
> + kernfs_activate(kn_subdir);
> + }
> +
> +config_out:
> + kernfs_put(l3_mon_kn);
> + if (ret)
> + kernfs_remove(kn_subdir);
> +
> + return ret;
> +}
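To illustrate why the IS_ERR() check in the loop above matters, here is a minimal userspace sketch (not kernel code). ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented locally to mimic the kernel helpers of the same names; child_status_as_posted() and child_status_fixed() are hypothetical names used only for this example.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace imitations of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Mirrors the check as posted: the parent pointer is tested, so a failed
 * child creation goes unnoticed whenever the parent itself is valid.
 */
static long child_status_as_posted(const void *parent, const void *child)
{
	if (IS_ERR(parent))
		return PTR_ERR(child);
	return 0;
}

/* Tests the pointer that was just returned, as the review comment suggests. */
static long child_status_fixed(const void *child)
{
	if (IS_ERR(child))
		return PTR_ERR(child);
	return 0;
}
```

With a valid parent and a child of ERR_PTR(-ENOMEM), the first helper reports success while the second reports -ENOMEM. In the posted loop that would mean the error pointer is then passed on to rdtgroup_kn_set_ugid().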
> +
> static unsigned long fflags_from_resource(struct rdt_resource *r)
> {
> switch (r->rid) {
> @@ -2360,6 +2431,12 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
> ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
> if (ret)
> goto out_destroy;
> +
> + if (r->mon.mbm_cntr_assignable) {
> + ret = resctrl_mkdir_info_configs(r, name, RFTYPE_MON_CONFIG);
> + if (ret)
> + goto out_destroy;
> + }
> }
>
> ret = rdtgroup_kn_set_ugid(kn_info);
Reinette