Message-ID: <dd444497162c38f81858ba3b3f95c29058d38232.1761090860.git.babu.moger@amd.com>
Date: Tue, 21 Oct 2025 18:54:49 -0500
From: Babu Moger <babu.moger@....com>
To: <tony.luck@...el.com>, <reinette.chatre@...el.com>, <tglx@...utronix.de>,
<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>
CC: <corbet@....net>, <Dave.Martin@....com>, <james.morse@....com>,
<babu.moger@....com>, <x86@...nel.org>, <hpa@...or.com>,
<akpm@...ux-foundation.org>, <paulmck@...nel.org>, <rdunlap@...radead.org>,
<pmladek@...e.com>, <kees@...nel.org>, <arnd@...db.de>, <fvdl@...gle.com>,
<seanjc@...gle.com>, <pawan.kumar.gupta@...ux.intel.com>, <xin@...or.com>,
<thomas.lendacky@....com>, <sohil.mehta@...el.com>, <jarkko@...nel.org>,
<chang.seok.bae@...el.com>, <ebiggers@...gle.com>,
<elena.reshetova@...el.com>, <ak@...ux.intel.com>,
<mario.limonciello@....com>, <perry.yuan@....com>,
<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<peternewman@...gle.com>
Subject: [PATCH v10 06/10] fs/resctrl: Add user interface to enable/disable io_alloc feature
AMD's SDCIAE forces all SDCI lines to be placed into the L3 cache portions
identified by the highest-supported L3_MASK_n register, where n is the
maximum supported CLOSID.
To support AMD's SDCIAE, reserve the highest CLOSID exclusively for I/O
allocation traffic when the io_alloc resctrl feature is enabled, making it no
longer available for general (CPU) cache allocation.
Introduce a user interface to enable/disable the io_alloc feature and
encourage users to enable io_alloc only when running workloads that can
benefit from this functionality. On enable, initialize the io_alloc CLOSID
with all usable CBMs across all the domains.
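For example, a minimal usage sketch (assuming an L3 cache resource; the
CLOSID value 15 and the group name "grpA" in the failure case below are
hypothetical, shown only to illustrate the error reporting through
last_cmd_status):
  # cat /sys/fs/resctrl/info/L3/io_alloc
  disabled
  # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
  # cat /sys/fs/resctrl/info/L3/io_alloc
  enabled
If the highest CLOSID is already claimed by a resource group, enabling
io_alloc fails and the reason is reported via last_cmd_status:
  # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
  -bash: echo: write error: No space left on device
  # cat /sys/fs/resctrl/info/last_cmd_status
  CLOSID (ctrl_hw_id) 15 for io_alloc is used by grpA group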
Since CLOSIDs are managed by resctrl fs, it is least invasive to make
"io_alloc is supported by the maximum supported CLOSID" part of the initial
resctrl fs support for io_alloc. Take care to minimally (only in error
messages) expose this use of a CLOSID for io_alloc to user space so that this
is not required from other architectures that may support io_alloc
differently in the future.
When resctrl is mounted with "-o cdp" to enable code/data prioritization,
there are two L3 resources that can support I/O allocation: L3CODE and
L3DATA. From the resctrl fs perspective the two resources share a CLOSID and
the architecture's available CLOSIDs are halved to support this. The
architecture's underlying CLOSID used by SDCIAE when CDP is enabled is the
CLOSID associated with the CDP_CODE resource, but from resctrl's perspective
there is only one CLOSID for both CDP_CODE and CDP_DATA. The CDP_DATA half of
that CLOSID is thus usable neither for general (CPU) cache allocation nor for
I/O allocation. Keep the CDP_CODE and CDP_DATA io_alloc status in sync to
avoid confusing user space. That is, enabling io_alloc on CDP_CODE does so on
CDP_DATA and vice-versa, and keep the I/O allocation CBMs of CDP_CODE and
CDP_DATA in sync.
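A sketch of the CDP case (assuming a hypothetical resource with 16 hardware
CLOSIDs; with CDP enabled resctrl exposes half of them, so io_alloc claims
resctrl CLOSID 7 and keeps CDP_CODE and CDP_DATA in sync):
  # mount -t resctrl resctrl -o cdp /sys/fs/resctrl
  # cat /sys/fs/resctrl/info/L3CODE/num_closids
  8
  # echo 1 > /sys/fs/resctrl/info/L3CODE/io_alloc
  # cat /sys/fs/resctrl/info/L3DATA/io_alloc
  enabled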
Signed-off-by: Babu Moger <babu.moger@....com>
---
v10: Rephrased the whole changelog to avoid repetition. Thanks to Reinette.
Minor print format changes.
Moved the hardware specific detail to resctrl.rst.
Changed the text L3CODE to CDP_CODE and L3DATA to CDP_DATA
v9: Updated the changelog.
Moved resctrl_arch_get_io_alloc_enabled() check earlier in the function.
Removed resctrl_get_schema().
Copied staged_config from peer when CDP is enabled.
v8: Updated the changelog.
Renamed resctrl_io_alloc_init_cat() to resctrl_io_alloc_init_cbm().
Moved resctrl_io_alloc_write() and all its dependencies to fs/resctrl/ctrlmondata.c.
Added prototypes for all the functions in fs/resctrl/internal.h.
v7: Separated resctrl_io_alloc_show and bit_usage changes in two separate
patches.
Added resctrl_io_alloc_closid_supported() to verify io_alloc CLOSID.
Initialized the schema for both CDP_DATA and CDP_CODE when CDP is enabled.
v6: Changed the subject to fs/resctrl:
v5: Resolved conflicts due to recent resctrl FS/ARCH code restructure.
Used rdt_kn_name() to get the rdtgroup name instead of accessing it directly
while printing the group name used by the io_alloc_closid.
Updated bit_usage to reflect the io_alloc CBM as discussed in the thread:
https://lore.kernel.org/lkml/3ca0a5dc-ad9c-4767-9011-b79d986e1e8d@intel.com/
Modified rdt_bit_usage_show() to read io_alloc_cbm in hw_shareable, ensuring
that bit_usage accurately represents the CBMs.
Updated the code to modify io_alloc either with L3CODE or L3DATA.
https://lore.kernel.org/lkml/c00c00ea-a9ac-4c56-961c-dc5bf633476b@intel.com/
v4: Updated the change log.
Updated the user doc.
The "io_alloc" interface will report "enabled/disabled/not supported".
Updated resctrl_io_alloc_closid_get() to verify the max closid availability.
Updated the documentation for "shareable_bits" and "bit_usage".
Introduced io_alloc_init() to initialize fflags.
Printed the group name when io_alloc enablement fails.
NOTE: io_alloc is about a specific CLOS. rdt_bit_usage_show() is not designed
to handle bit_usage for a specific CLOS; it is about the overall system. So,
we cannot really tell the user which CLOS is shared across both hardware and
software. We need to discuss this.
v3: Rewrote the change to make it generic.
Rewrote the documentation in resctrl.rst to be generic and added
AMD feature details in the end.
Added the check to verify if MAX CLOSID availability on the system.
Added CDP check to make sure io_alloc is configured in CDP_CODE.
Added resctrl_io_alloc_closid_free() to free the io_alloc CLOSID.
Added errors in few cases when CLOSID allocation fails.
Fixed a splat reported when info/L3/bit_usage is accessed while io_alloc
is enabled.
https://lore.kernel.org/lkml/SJ1PR11MB60837B532254E7B23BC27E84FC052@SJ1PR11MB6083.namprd11.prod.outlook.com/
v2: Renamed the feature to "io_alloc".
Added generic texts for the feature in commit log and resctrl.rst doc.
Added resctrl_io_alloc_init_cat() to initialize io_alloc to default
values when enabled.
Fixed io_alloc show functionality to display only on the L3 resource.
---
Documentation/filesystems/resctrl.rst | 23 +++++
fs/resctrl/ctrlmondata.c | 122 ++++++++++++++++++++++++++
fs/resctrl/internal.h | 11 +++
fs/resctrl/rdtgroup.c | 24 ++++-
4 files changed, 177 insertions(+), 3 deletions(-)
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 108995640ca5..89e856e6892c 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -152,6 +152,29 @@ related to allocation:
"not supported":
Support not available for this resource.
+ The feature can be modified by writing to the interface, for example:
+
+ To enable:
+ # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
+
+ To disable:
+ # echo 0 > /sys/fs/resctrl/info/L3/io_alloc
+
+ The underlying implementation may reduce resources available to
+ general (CPU) cache allocation. See the architecture-specific notes
+ below. The feature can be enabled or disabled depending on usage
+ requirements.
+
+ On AMD systems, the io_alloc feature is supported by the L3 Smart
+ Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
+ io_alloc is the highest CLOSID supported by the resource. When
+ io_alloc is enabled, the highest CLOSID is dedicated to io_alloc and
+ no longer available for general (CPU) cache allocation. When CDP is
+ enabled, io_alloc routes I/O traffic using the highest CLOSID allocated
+ for the instruction cache (CDP_CODE), making this CLOSID no longer
+ available for general (CPU) cache allocation for both the CDP_CODE
+ and CDP_DATA resources.
+
Memory bandwidth(MB) subdirectory contains the following files
with respect to allocation:
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index 78a8e7b4ba24..bdec06bcf3e3 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -697,3 +697,125 @@ int resctrl_io_alloc_show(struct kernfs_open_file *of, struct seq_file *seq, voi
return 0;
}
+
+/*
+ * resctrl_io_alloc_closid_supported() - io_alloc feature utilizes the
+ * highest CLOSID value to direct I/O traffic. Ensure that io_alloc_closid
+ * is in the supported range.
+ */
+static bool resctrl_io_alloc_closid_supported(u32 io_alloc_closid)
+{
+ return io_alloc_closid < closids_supported();
+}
+
+/*
+ * Initialize io_alloc CLOSID cache resource CBM with all usable (shared
+ * and unused) cache portions.
+ */
+static int resctrl_io_alloc_init_cbm(struct resctrl_schema *s, u32 closid)
+{
+ enum resctrl_conf_type peer_type;
+ struct rdt_resource *r = s->res;
+ struct rdt_ctrl_domain *d;
+ int ret;
+
+ rdt_staged_configs_clear();
+
+ ret = rdtgroup_init_cat(s, closid);
+ if (ret < 0)
+ goto out;
+
+ /* Keep CDP_CODE and CDP_DATA of io_alloc CLOSID's CBM in sync. */
+ if (resctrl_arch_get_cdp_enabled(r->rid)) {
+ peer_type = resctrl_peer_type(s->conf_type);
+ list_for_each_entry(d, &s->res->ctrl_domains, hdr.list)
+ memcpy(&d->staged_config[peer_type],
+ &d->staged_config[s->conf_type],
+ sizeof(d->staged_config[0]));
+ }
+
+ ret = resctrl_arch_update_domains(r, closid);
+out:
+ rdt_staged_configs_clear();
+ return ret;
+}
+
+/*
+ * resctrl_io_alloc_closid() - io_alloc feature routes I/O traffic using
+ * the highest available CLOSID. Retrieve the maximum CLOSID supported by the
+ * resource. Note that if Code Data Prioritization (CDP) is enabled, the number
+ * of available CLOSIDs is reduced by half.
+ */
+static u32 resctrl_io_alloc_closid(struct rdt_resource *r)
+{
+ if (resctrl_arch_get_cdp_enabled(r->rid))
+ return resctrl_arch_get_num_closid(r) / 2 - 1;
+ else
+ return resctrl_arch_get_num_closid(r) - 1;
+}
+
+ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
+ size_t nbytes, loff_t off)
+{
+ struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+ struct rdt_resource *r = s->res;
+ char const *grp_name;
+ u32 io_alloc_closid;
+ bool enable;
+ int ret;
+
+ ret = kstrtobool(buf, &enable);
+ if (ret)
+ return ret;
+
+ cpus_read_lock();
+ mutex_lock(&rdtgroup_mutex);
+
+ rdt_last_cmd_clear();
+
+ if (!r->cache.io_alloc_capable) {
+ rdt_last_cmd_printf("io_alloc is not supported on %s\n", s->name);
+ ret = -ENODEV;
+ goto out_unlock;
+ }
+
+ /* If the feature is already up to date, no action is needed. */
+ if (resctrl_arch_get_io_alloc_enabled(r) == enable)
+ goto out_unlock;
+
+ io_alloc_closid = resctrl_io_alloc_closid(r);
+ if (!resctrl_io_alloc_closid_supported(io_alloc_closid)) {
+ rdt_last_cmd_printf("io_alloc CLOSID (ctrl_hw_id) %u is not available\n",
+ io_alloc_closid);
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
+ if (enable) {
+ if (!closid_alloc_fixed(io_alloc_closid)) {
+ grp_name = rdtgroup_name_by_closid(io_alloc_closid);
+ WARN_ON_ONCE(!grp_name);
+ rdt_last_cmd_printf("CLOSID (ctrl_hw_id) %u for io_alloc is used by %s group\n",
+ io_alloc_closid, grp_name ? grp_name : "another");
+ ret = -ENOSPC;
+ goto out_unlock;
+ }
+
+ ret = resctrl_io_alloc_init_cbm(s, io_alloc_closid);
+ if (ret) {
+ rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
+ closid_free(io_alloc_closid);
+ goto out_unlock;
+ }
+ } else {
+ closid_free(io_alloc_closid);
+ }
+
+ ret = resctrl_arch_io_alloc_enable(r, enable);
+
+out_unlock:
+ mutex_unlock(&rdtgroup_mutex);
+ cpus_read_unlock();
+
+ return ret ?: nbytes;
+}
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index a18ed8889396..26ab8f9b30d8 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -390,6 +390,8 @@ void rdt_staged_configs_clear(void);
bool closid_allocated(unsigned int closid);
+bool closid_alloc_fixed(u32 closid);
+
int resctrl_find_cleanest_closid(void);
void *rdt_kn_parent_priv(struct kernfs_node *kn);
@@ -428,6 +430,15 @@ ssize_t mbm_L3_assignments_write(struct kernfs_open_file *of, char *buf, size_t
loff_t off);
int resctrl_io_alloc_show(struct kernfs_open_file *of, struct seq_file *seq, void *v);
+int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
+
+enum resctrl_conf_type resctrl_peer_type(enum resctrl_conf_type my_type);
+
+ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
+ size_t nbytes, loff_t off);
+
+const char *rdtgroup_name_by_closid(int closid);
+
#ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 6d89160ebd11..5dec581336dd 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -226,6 +226,11 @@ bool closid_allocated(unsigned int closid)
return !test_bit(closid, closid_free_map);
}
+bool closid_alloc_fixed(u32 closid)
+{
+ return __test_and_clear_bit(closid, closid_free_map);
+}
+
/**
* rdtgroup_mode_by_closid - Return mode of resource group with closid
* @closid: closid if the resource group
@@ -1247,7 +1252,7 @@ static int rdtgroup_mode_show(struct kernfs_open_file *of,
return 0;
}
-static enum resctrl_conf_type resctrl_peer_type(enum resctrl_conf_type my_type)
+enum resctrl_conf_type resctrl_peer_type(enum resctrl_conf_type my_type)
{
switch (my_type) {
case CDP_CODE:
@@ -1838,6 +1843,18 @@ void resctrl_bmec_files_show(struct rdt_resource *r, struct kernfs_node *l3_mon_
kernfs_put(mon_kn);
}
+const char *rdtgroup_name_by_closid(int closid)
+{
+ struct rdtgroup *rdtgrp;
+
+ list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
+ if (rdtgrp->closid == closid)
+ return rdt_kn_name(rdtgrp->kn);
+ }
+
+ return NULL;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -1949,9 +1966,10 @@ static struct rftype res_common_files[] = {
},
{
.name = "io_alloc",
- .mode = 0444,
+ .mode = 0644,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = resctrl_io_alloc_show,
+ .write = resctrl_io_alloc_write,
},
{
.name = "max_threshold_occupancy",
@@ -3500,7 +3518,7 @@ static int __init_one_rdt_domain(struct rdt_ctrl_domain *d, struct resctrl_schem
* If there are no more shareable bits available on any domain then
* the entire allocation will fail.
*/
-static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid)
+int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid)
{
struct rdt_ctrl_domain *d;
int ret;
--
2.34.1