Message-ID: <ebd0a6bf-cb55-47ba-b9dc-65f56b065ab0@intel.com>
Date: Mon, 21 Jul 2025 16:40:32 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <corbet@....net>, <tony.luck@...el.com>,
<Dave.Martin@....com>, <james.morse@....com>, <tglx@...utronix.de>,
<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <akpm@...ux-foundation.org>,
<paulmck@...nel.org>, <rostedt@...dmis.org>, <Neeraj.Upadhyay@....com>,
<david@...hat.com>, <arnd@...db.de>, <fvdl@...gle.com>, <seanjc@...gle.com>,
<thomas.lendacky@....com>, <pawan.kumar.gupta@...ux.intel.com>,
<yosry.ahmed@...ux.dev>, <sohil.mehta@...el.com>, <xin@...or.com>,
<kai.huang@...el.com>, <xiaoyao.li@...el.com>, <peterz@...radead.org>,
<me@...aill.net>, <mario.limonciello@....com>, <xin3.li@...el.com>,
<ebiggers@...gle.com>, <ak@...ux.intel.com>, <chang.seok.bae@...el.com>,
<andrew.cooper3@...rix.com>, <perry.yuan@....com>,
<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v7 07/10] fs/resctrl: Add user interface to enable/disable
io_alloc feature
Hi Babu,
On 7/10/25 10:16 AM, Babu Moger wrote:
> "io_alloc" feature in resctrl enables direct insertion of data from I/O
> devices into the cache.
>
> Introduce user interface to enable/disable io_alloc feature.
I think it is worth mentioning *why* a user may want to disable this feature and
why it is not just always enabled. Here it can be highlighted that this feature
may take resources (a CLOSID) away from general (CPU) cache allocation, and since
CLOSIDs may be scarce, enabling the user to disable this feature supports different use cases.
>
> On AMD systems, io_alloc feature is backed by SDCIAE (L3 Smart Data Cache
> Injection Allocation Enforcement). When enabled, SDCIAE directs all SDCI
> lines to be placed into the L3 cache partitions specified by the register
> corresponding to the highest CLOSID supported by the resource. With CDP
> enabled, io_alloc routes I/O traffic using the highest CLOSID assigned to
> the instruction cache (L3CODE).
This is a lot of architecture-specific text for a resctrl fs patch ... I think
you are trying to motivate the resctrl fs implementation. The same motivation
proposed for the cover letter can be used here to help explain the implementation
choices.
>
> Signed-off-by: Babu Moger <babu.moger@....com>
> ---
...
> ---
> Documentation/filesystems/resctrl.rst | 8 ++
> fs/resctrl/rdtgroup.c | 149 +++++++++++++++++++++++++-
> 2 files changed, 156 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index 354e6a00fa45..189c1ccb92d6 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -157,6 +157,14 @@ related to allocation:
> "not supported":
> Support not available on the system.
>
> + The feature can be modified by writing to the interface, for example:
> +
> + To enable:
> + # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
> +
> + To disable:
> + # echo 0 > /sys/fs/resctrl/info/L3/io_alloc
> +
> The underlying implementation may reduce resources available to
> general (CPU) cache allocation. See architecture specific notes
> below. Depending on usage requirements the feature can be enabled
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index d7c4417b4516..06c854caa55c 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -70,6 +70,7 @@ static struct seq_buf last_cmd_status;
> static char last_cmd_status_buf[512];
>
> static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
> +static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
>
> static void rdtgroup_destroy_root(void);
>
> @@ -232,6 +233,16 @@ bool closid_allocated(unsigned int closid)
> return !test_bit(closid, closid_free_map);
> }
>
> +static bool resctrl_io_alloc_closid_alloc(u32 io_alloc_closid)
> +{
> + return __test_and_clear_bit(io_alloc_closid, closid_free_map);
> +}
> +
> +static void resctrl_io_alloc_closid_free(u32 io_alloc_closid)
> +{
> + closid_free(io_alloc_closid);
> +}
I do not think these should be helpers/wrappers with a separate
namespace. It will make the code easier to understand if it is clear that
the "io_alloc" CLOSID is allocated from the same "pool" as the CLOSID for
control groups.
I would thus propose a specific helper, for example closid_alloc_fixed(u32 closid),
and just call closid_free() directly.
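To illustrate, a minimal sketch of what such a helper could look like. This is a
standalone userspace mock, not kernel code: closid_alloc_fixed() is the name I am
proposing, and the plain unsigned long bitmap stands in for the kernel's
closid_free_map so the __test_and_clear_bit() semantics are visible.

```c
#include <stdbool.h>

/* Mock of the shared CLOSID free map (all CLOSIDs initially free). */
static unsigned long closid_free_map = ~0UL;

/*
 * Allocate one specific CLOSID from the shared pool.
 * Returns true if @closid was free and is now allocated,
 * false if it was already in use.
 */
static bool closid_alloc_fixed(unsigned int closid)
{
	unsigned long mask = 1UL << closid;

	if (!(closid_free_map & mask))
		return false;	/* already allocated to a control group */
	closid_free_map &= ~mask;
	return true;
}

/* Return a CLOSID to the shared pool. */
static void closid_free(unsigned int closid)
{
	closid_free_map |= 1UL << closid;
}
```

The point of the shape above: the caller can see at a glance that the io_alloc
CLOSID comes from the same pool as control-group CLOSIDs, with no separate
resctrl_io_alloc_* namespace for allocation.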
> +
> /**
> * rdtgroup_mode_by_closid - Return mode of resource group with closid
> * @closid: closid if the resource group
> @@ -1030,6 +1041,16 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
> return 0;
> }
>
> +/*
> + * resctrl_io_alloc_closid_supported() - io_alloc feature utilizes the
> + * highest CLOSID value to direct I/O traffic. Ensure that io_alloc_closid
> + * is in the supported range.
> + */
> +static bool resctrl_io_alloc_closid_supported(u32 io_alloc_closid)
Please move to ctrlmondata.c
> +{
> + return io_alloc_closid < closids_supported();
> +}
> +
> /*
> * resctrl_io_alloc_closid() - io_alloc feature routes I/O traffic using
> * the highest available CLOSID. Retrieve the maximum CLOSID supported by the
> @@ -1858,6 +1879,131 @@ static int resctrl_io_alloc_show(struct kernfs_open_file *of,
> return 0;
> }
>
> +/*
> + * Initialize io_alloc CLOSID cache resource CBM with all usable (shared
> + * and unused) cache portions.
> + */
> +static int resctrl_io_alloc_init_cat(struct rdt_resource *r,
Mixing two features (io_alloc and CAT) in the function name is confusing.
How about resctrl_io_alloc_init_cbm() and move to ctrlmondata.c?
> + struct resctrl_schema *s, u32 closid)
No need to provide resource as parameter, it can be determined from schema.
> +{
> + int ret;
> +
> + rdt_staged_configs_clear();
> +
> + ret = rdtgroup_init_cat(s, closid);
> + if (ret < 0)
> + goto out;
More below, but I think this flow can be simplified by moving the logic
handling CDP here. If CDP is enabled for the resource then a successful
rdtgroup_init_cat() can just be followed by a snippet that copies the
staged config of the CDP type to the staged config of its peer type.
Their CBMs are supposed to be identical so there is no reason for all the
rdtgroup_init_cat() processing to be repeated. resctrl_arch_update_domains()
can handle updating both in a single call.
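A rough sketch of the copy I have in mind, using simplified standalone types
(the struct layout and names here are illustrative, not the kernel's): after
rdtgroup_init_cat() fills the staged config for one CDP type, the peer type's
CBM is supposed to be identical, so it can simply be copied rather than
recomputed.

```c
#include <stdint.h>
#include <stdbool.h>

/* Simplified stand-ins for the CDP configuration types. */
enum conf_type { CONF_CODE, CONF_DATA, CONF_NUM };

struct staged_config {
	uint32_t new_ctrl;	/* the CBM to be programmed */
	bool have_new_ctrl;
};

/* One staged slot per CDP type for a single domain. */
static struct staged_config staged[CONF_NUM];

static enum conf_type peer_type(enum conf_type t)
{
	return t == CONF_CODE ? CONF_DATA : CONF_CODE;
}

/*
 * After a successful init of type @t, mirror its staged config into
 * the peer type's slot so both carry the identical CBM.
 */
static void stage_copy_to_peer(enum conf_type t)
{
	staged[peer_type(t)] = staged[t];
}
```

With both slots staged this way, a single resctrl_arch_update_domains() call can
program both types, and the write handler no longer needs its own CDP branch.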
> +
> + ret = resctrl_arch_update_domains(r, closid);
> +
> +out:
> + rdt_staged_configs_clear();
> + return ret;
> +}
> +
> +static const char *rdtgroup_name_by_closid(int closid)
This seems generic enough and appropriate for rdtgroup.c
> +{
> + struct rdtgroup *rdtgrp;
> +
> + list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
> + if (rdtgrp->closid == closid)
> + return rdt_kn_name(rdtgrp->kn);
> + }
> +
> + return NULL;
> +}
> +
> +static struct resctrl_schema *resctrl_get_schema(enum resctrl_conf_type type)
This also seems generic enough and appropriate for rdtgroup.c
> +{
> + struct resctrl_schema *schema;
> +
> + list_for_each_entry(schema, &resctrl_schema_all, list) {
> + if (schema->conf_type == type)
> + return schema;
> + }
> +
> + return NULL;
> +}
> +
> +static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
> + size_t nbytes, loff_t off)
> +{
> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
> + enum resctrl_conf_type peer_type;
> + struct rdt_resource *r = s->res;
> + struct resctrl_schema *peer_s;
> + char const *grp_name;
> + u32 io_alloc_closid;
> + bool enable;
> + int ret;
> +
> + ret = kstrtobool(buf, &enable);
> + if (ret)
> + return ret;
> +
> + cpus_read_lock();
> + mutex_lock(&rdtgroup_mutex);
> +
> + rdt_last_cmd_clear();
> +
> + if (!r->cache.io_alloc_capable) {
> + rdt_last_cmd_printf("io_alloc is not supported on %s\n", s->name);
> + ret = -ENODEV;
> + goto out_unlock;
> + }
> +
> + io_alloc_closid = resctrl_io_alloc_closid(r);
> + if (!resctrl_io_alloc_closid_supported(io_alloc_closid)) {
> + rdt_last_cmd_printf("io_alloc CLOSID (ctrl_hw_id) %d is not available\n",
> + io_alloc_closid);
> + ret = -EINVAL;
> + goto out_unlock;
> + }
> +
> + /* If the feature is already up to date, no action is needed. */
> + if (resctrl_arch_get_io_alloc_enabled(r) == enable)
> + goto out_unlock;
> +
> + if (enable) {
> + if (!resctrl_io_alloc_closid_alloc(io_alloc_closid)) {
> + grp_name = rdtgroup_name_by_closid(io_alloc_closid);
> + WARN_ON_ONCE(!grp_name);
> + rdt_last_cmd_printf("CLOSID (ctrl_hw_id) %d for io_alloc is used by %s group\n",
> + io_alloc_closid, grp_name ? grp_name : "another");
> + ret = -ENOSPC;
> + goto out_unlock;
> + }
> +
> + /* Initialize schema for both CDP_DATA and CDP_CODE when CDP is enabled */
> + if (resctrl_arch_get_cdp_enabled(r->rid)) {
I think this block can be removed to simplify the flow by moving the CDP handling to
resctrl_io_alloc_init_cat().
> + peer_type = resctrl_peer_type(s->conf_type);
> + peer_s = resctrl_get_schema(peer_type);
> + if (peer_s)
> + ret = resctrl_io_alloc_init_cat(r, peer_s, io_alloc_closid);
> + }
> +
> + if (!ret)
> + ret = resctrl_io_alloc_init_cat(r, s, io_alloc_closid);
> +
> + if (ret) {
> + rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
> + resctrl_io_alloc_closid_free(io_alloc_closid);
> + goto out_unlock;
> + }
> +
> + } else {
> + resctrl_io_alloc_closid_free(io_alloc_closid);
> + }
> +
> + ret = resctrl_arch_io_alloc_enable(r, enable);
> +
> +out_unlock:
> + mutex_unlock(&rdtgroup_mutex);
> + cpus_read_unlock();
> +
> + return ret ?: nbytes;
> +}
> +
> /* rdtgroup information files for one cache resource. */
> static struct rftype res_common_files[] = {
> {
> @@ -1950,9 +2096,10 @@ static struct rftype res_common_files[] = {
> },
> {
> .name = "io_alloc",
> - .mode = 0444,
> + .mode = 0644,
> .kf_ops = &rdtgroup_kf_single_ops,
> .seq_show = resctrl_io_alloc_show,
> + .write = resctrl_io_alloc_write,
> },
> {
> .name = "max_threshold_occupancy",
Reinette