Message-ID: <f53db5c6dd7d0580c0cb1f252d947357d097842b.1754436586.git.babu.moger@amd.com>
Date: Tue, 5 Aug 2025 18:30:26 -0500
From: Babu Moger <babu.moger@....com>
To: <corbet@....net>, <tony.luck@...el.com>, <reinette.chatre@...el.com>,
<Dave.Martin@....com>, <james.morse@....com>, <tglx@...utronix.de>,
<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <akpm@...ux-foundation.org>,
<paulmck@...nel.org>, <rostedt@...dmis.org>, <Neeraj.Upadhyay@....com>,
<david@...hat.com>, <arnd@...db.de>, <fvdl@...gle.com>, <seanjc@...gle.com>,
<thomas.lendacky@....com>, <pawan.kumar.gupta@...ux.intel.com>,
<babu.moger@....com>, <yosry.ahmed@...ux.dev>, <sohil.mehta@...el.com>,
<xin@...or.com>, <kai.huang@...el.com>, <xiaoyao.li@...el.com>,
<peterz@...radead.org>, <me@...aill.net>, <mario.limonciello@....com>,
<xin3.li@...el.com>, <ebiggers@...gle.com>, <ak@...ux.intel.com>,
<chang.seok.bae@...el.com>, <andrew.cooper3@...rix.com>,
<perry.yuan@....com>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <manali.shukla@....com>,
<gautham.shenoy@....com>
Subject: [PATCH v8 06/10] fs/resctrl: Add user interface to enable/disable io_alloc feature
"io_alloc" feature in resctrl enables direct insertion of data from I/O
devices into the cache.
On AMD systems, when io_alloc is enabled, the highest CLOSID is reserved
exclusively for I/O allocation traffic and is no longer available for
general CPU cache allocation. Users are encouraged to enable it only when
running workloads that can benefit from this functionality.
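
As an illustration of that reservation (a sketch for this changelog only,
not code from the patch), the reserved CLOSID falls out of the CLOSID count
a resource reports; the helper name example_io_alloc_closid() and the
16-CLOSID figure are assumptions for the example:

  /*
   * Illustrative sketch, not the patch's implementation: with an assumed
   * 16 CLOSIDs on L3, the highest CLOSID (15) is set aside for io_alloc;
   * with CDP enabled the usable CLOSID space is halved, so CLOSID 7 would
   * be reserved instead.
   */
  static unsigned int example_io_alloc_closid(unsigned int num_closid,
					      bool cdp_enabled)
  {
	if (cdp_enabled)
		return num_closid / 2 - 1;	/* e.g. 16 CLOSIDs -> 7  */
	return num_closid - 1;			/* e.g. 16 CLOSIDs -> 15 */
  }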

Since CLOSIDs are managed by resctrl fs, it is least invasive to make
"io_alloc is supported by the maximum supported CLOSID" part of the initial
resctrl fs support for io_alloc. Take care not to expose this use of a
CLOSID for io_alloc to user space, so that it is not required of other
architectures that may support io_alloc differently in the future.

Introduce a user interface to enable/disable the io_alloc feature.

Signed-off-by: Babu Moger <babu.moger@....com>
---
v8: Updated the changelog.
    Renamed resctrl_io_alloc_init_cat() to resctrl_io_alloc_init_cbm().
    Moved resctrl_io_alloc_write() and all its dependencies to fs/resctrl/ctrlmondata.c.
    Added prototypes for all the functions in fs/resctrl/internal.h.
v7: Separated the resctrl_io_alloc_show() and bit_usage changes into two
    separate patches.
    Added resctrl_io_alloc_closid_supported() to verify the io_alloc CLOSID.
    Initialized the schema for both CDP_DATA and CDP_CODE when CDP is enabled.
v6: Changed the subject to fs/resctrl:
v5: Resolved conflicts due to the recent resctrl FS/ARCH code restructure.
    Used rdt_kn_name() to get the rdtgroup name instead of accessing it directly
    while printing the group name used by the io_alloc_closid.
    Updated bit_usage to reflect the io_alloc CBM as discussed in the thread:
    https://lore.kernel.org/lkml/3ca0a5dc-ad9c-4767-9011-b79d986e1e8d@intel.com/
    Modified rdt_bit_usage_show() to read io_alloc_cbm in hw_shareable, ensuring
    that bit_usage accurately represents the CBMs.
    Updated the code to modify io_alloc with either L3CODE or L3DATA.
    https://lore.kernel.org/lkml/c00c00ea-a9ac-4c56-961c-dc5bf633476b@intel.com/
v4: Updated the change log.
    Updated the user doc.
    The "io_alloc" interface will report "enabled/disabled/not supported".
    Updated resctrl_io_alloc_closid_get() to verify the max CLOSID availability.
    Updated the documentation for "shareable_bits" and "bit_usage".
    Introduced io_alloc_init() to initialize fflags.
    Printed the group name when io_alloc enablement fails.
    NOTE: io_alloc is about a specific CLOS. rdt_bit_usage_show() is not
    designed to handle bit_usage for a specific CLOS; it is about the overall
    system. So we cannot really tell the user which CLOS is shared across both
    hardware and software. We need to discuss this.
v3: Rewrote the change to make it generic.
    Rewrote the documentation in resctrl.rst to be generic and added
    AMD feature details at the end.
    Added a check to verify MAX CLOSID availability on the system.
    Added a CDP check to make sure io_alloc is configured in CDP_CODE.
    Added resctrl_io_alloc_closid_free() to free the io_alloc CLOSID.
    Added errors in a few cases when CLOSID allocation fails.
    Fixed the splat reported when info/L3/bit_usage is accessed while io_alloc
    is enabled.
    https://lore.kernel.org/lkml/SJ1PR11MB60837B532254E7B23BC27E84FC052@SJ1PR11MB6083.namprd11.prod.outlook.com/
v2: Renamed the feature to "io_alloc".
    Added generic text for the feature in the commit log and resctrl.rst doc.
    Added resctrl_io_alloc_init_cat() to initialize io_alloc to default
    values when enabled.
    Fixed the io_alloc show functionality to display only on the L3 resource.
---
 Documentation/filesystems/resctrl.rst |   8 ++
 fs/resctrl/ctrlmondata.c              | 137 ++++++++++++++++++++++++++
 fs/resctrl/internal.h                 |  11 +++
 fs/resctrl/rdtgroup.c                 |  24 ++++-
 4 files changed, 177 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index fff7e04d1e2a..bd0a633afbb9 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -150,6 +150,14 @@ related to allocation:
 	"not supported":
 		Support not available for this resource.
 
+	The feature can be modified by writing to the interface, for example:
+
+	To enable:
+	# echo 1 > /sys/fs/resctrl/info/L3/io_alloc
+
+	To disable:
+	# echo 0 > /sys/fs/resctrl/info/L3/io_alloc
+
 	The underlying implementation may reduce resources available to
 	general (CPU) cache allocation. See architecture specific notes
 	below. Depending on usage requirements the feature can be enabled
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index d495a5d5c9d5..bf982eab7b18 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -685,3 +685,140 @@ int resctrl_io_alloc_show(struct kernfs_open_file *of, struct seq_file *seq, voi
 	return 0;
 }
+
+/*
+ * resctrl_io_alloc_closid_supported() - io_alloc feature utilizes the
+ * highest CLOSID value to direct I/O traffic. Ensure that io_alloc_closid
+ * is in the supported range.
+ */
+static bool resctrl_io_alloc_closid_supported(u32 io_alloc_closid)
+{
+	return io_alloc_closid < closids_supported();
+}
+
+static struct resctrl_schema *resctrl_get_schema(enum resctrl_conf_type type)
+{
+	struct resctrl_schema *schema;
+
+	list_for_each_entry(schema, &resctrl_schema_all, list) {
+		if (schema->conf_type == type)
+			return schema;
+	}
+
+	return NULL;
+}
+
+/*
+ * Initialize io_alloc CLOSID cache resource CBM with all usable (shared
+ * and unused) cache portions.
+ */
+static int resctrl_io_alloc_init_cbm(struct resctrl_schema *s, u32 closid)
+{
+	struct rdt_resource *r = s->res;
+	enum resctrl_conf_type peer_type;
+	struct resctrl_schema *peer_s;
+	int ret;
+
+	rdt_staged_configs_clear();
+
+	ret = rdtgroup_init_cat(s, closid);
+	if (ret < 0)
+		goto out;
+
+	/* Initialize schema for both CDP_DATA and CDP_CODE when CDP is enabled */
+	if (resctrl_arch_get_cdp_enabled(r->rid)) {
+		peer_type = resctrl_peer_type(s->conf_type);
+		peer_s = resctrl_get_schema(peer_type);
+		if (peer_s) {
+			ret = rdtgroup_init_cat(peer_s, closid);
+			if (ret < 0)
+				goto out;
+		}
+	}
+
+	ret = resctrl_arch_update_domains(r, closid);
+
+out:
+	rdt_staged_configs_clear();
+	return ret;
+}
+
+/*
+ * resctrl_io_alloc_closid() - io_alloc feature routes I/O traffic using
+ * the highest available CLOSID. Retrieve the maximum CLOSID supported by the
+ * resource. Note that if Code Data Prioritization (CDP) is enabled, the number
+ * of available CLOSIDs is reduced by half.
+ */
+static u32 resctrl_io_alloc_closid(struct rdt_resource *r)
+{
+	if (resctrl_arch_get_cdp_enabled(r->rid))
+		return resctrl_arch_get_num_closid(r) / 2 - 1;
+	else
+		return resctrl_arch_get_num_closid(r) - 1;
+}
+
+ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
+			       size_t nbytes, loff_t off)
+{
+	struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+	struct rdt_resource *r = s->res;
+	char const *grp_name;
+	u32 io_alloc_closid;
+	bool enable;
+	int ret;
+
+	ret = kstrtobool(buf, &enable);
+	if (ret)
+		return ret;
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!r->cache.io_alloc_capable) {
+		rdt_last_cmd_printf("io_alloc is not supported on %s\n", s->name);
+		ret = -ENODEV;
+		goto out_unlock;
+	}
+
+	io_alloc_closid = resctrl_io_alloc_closid(r);
+	if (!resctrl_io_alloc_closid_supported(io_alloc_closid)) {
+		rdt_last_cmd_printf("io_alloc CLOSID (ctrl_hw_id) %d is not available\n",
+				    io_alloc_closid);
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	/* If the feature is already up to date, no action is needed. */
+	if (resctrl_arch_get_io_alloc_enabled(r) == enable)
+		goto out_unlock;
+
+	if (enable) {
+		if (!closid_alloc_fixed(io_alloc_closid)) {
+			grp_name = rdtgroup_name_by_closid(io_alloc_closid);
+			WARN_ON_ONCE(!grp_name);
+			rdt_last_cmd_printf("CLOSID (ctrl_hw_id) %d for io_alloc is used by %s group\n",
+					    io_alloc_closid, grp_name ? grp_name : "another");
+			ret = -ENOSPC;
+			goto out_unlock;
+		}
+
+		ret = resctrl_io_alloc_init_cbm(s, io_alloc_closid);
+		if (ret) {
+			rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
+			closid_free(io_alloc_closid);
+			goto out_unlock;
+		}
+	} else {
+		closid_free(io_alloc_closid);
+	}
+
+	ret = resctrl_arch_io_alloc_enable(r, enable);
+
+out_unlock:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 1a4543c2b988..335def7af1f6 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -373,6 +373,8 @@ void rdt_staged_configs_clear(void);
 
 bool closid_allocated(unsigned int closid);
 
+bool closid_alloc_fixed(u32 closid);
+
 int resctrl_find_cleanest_closid(void);
 
 int resctrl_io_alloc_show(struct kernfs_open_file *of, struct seq_file *seq,
@@ -380,6 +382,15 @@ int resctrl_io_alloc_show(struct kernfs_open_file *of, struct seq_file *seq,
 
 void *rdt_kn_parent_priv(struct kernfs_node *kn);
 
+int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
+
+enum resctrl_conf_type resctrl_peer_type(enum resctrl_conf_type my_type);
+
+ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
+			       size_t nbytes, loff_t off);
+
+const char *rdtgroup_name_by_closid(int closid);
+
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index acc9edc1a268..380ebc86c748 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -232,6 +232,11 @@ bool closid_allocated(unsigned int closid)
 	return !test_bit(closid, closid_free_map);
 }
 
+bool closid_alloc_fixed(u32 closid)
+{
+	return __test_and_clear_bit(closid, closid_free_map);
+}
+
 /**
  * rdtgroup_mode_by_closid - Return mode of resource group with closid
  * @closid: closid if the resource group
@@ -1250,7 +1255,7 @@ static int rdtgroup_mode_show(struct kernfs_open_file *of,
 	return 0;
 }
 
-static enum resctrl_conf_type resctrl_peer_type(enum resctrl_conf_type my_type)
+enum resctrl_conf_type resctrl_peer_type(enum resctrl_conf_type my_type)
 {
 	switch (my_type) {
 	case CDP_CODE:
@@ -1803,6 +1808,18 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
 	return ret ?: nbytes;
 }
 
+const char *rdtgroup_name_by_closid(int closid)
+{
+	struct rdtgroup *rdtgrp;
+
+	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
+		if (rdtgrp->closid == closid)
+			return rdt_kn_name(rdtgrp->kn);
+	}
+
+	return NULL;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -1895,9 +1912,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name = "io_alloc",
-		.mode = 0444,
+		.mode = 0644,
 		.kf_ops = &rdtgroup_kf_single_ops,
 		.seq_show = resctrl_io_alloc_show,
+		.write = resctrl_io_alloc_write,
 	},
 	{
 		.name = "max_threshold_occupancy",
@@ -3365,7 +3383,7 @@ static int __init_one_rdt_domain(struct rdt_ctrl_domain *d, struct resctrl_schem
  * If there are no more shareable bits available on any domain then
  * the entire allocation will fail.
  */
-static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid)
+int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid)
 {
 	struct rdt_ctrl_domain *d;
 	int ret;
--
2.34.1