[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5c784fb8-fafc-4529-afd4-d1acb71aded0@amd.com>
Date: Thu, 31 Jul 2025 17:52:40 -0500
From: "Moger, Babu" <bmoger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>,
Babu Moger <babu.moger@....com>, corbet@....net, tony.luck@...el.com,
Dave.Martin@....com, james.morse@....com, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, akpm@...ux-foundation.org,
paulmck@...nel.org, rostedt@...dmis.org, Neeraj.Upadhyay@....com,
david@...hat.com, arnd@...db.de, fvdl@...gle.com, seanjc@...gle.com,
thomas.lendacky@....com, pawan.kumar.gupta@...ux.intel.com,
yosry.ahmed@...ux.dev, sohil.mehta@...el.com, xin@...or.com,
kai.huang@...el.com, xiaoyao.li@...el.com, peterz@...radead.org,
me@...aill.net, mario.limonciello@....com, xin3.li@...el.com,
ebiggers@...gle.com, ak@...ux.intel.com, chang.seok.bae@...el.com,
andrew.cooper3@...rix.com, perry.yuan@....com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 07/10] fs/resctrl: Add user interface to enable/disable
io_alloc feature
Hi Reinette,
On 7/21/2025 6:40 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 7/10/25 10:16 AM, Babu Moger wrote:
>> "io_alloc" feature in resctrl enables direct insertion of data from I/O
>> devices into the cache.
>>
>> Introduce user interface to enable/disable io_alloc feature.
>
> I think it is worth a mention *why* a user may want to disable this feature and
> why is not just always enabled. Here it can be highlighted that this feature
> may take resources (CLOSID) away from general (CPU) cache allocation and since
> this may be scarce enabling user to disable this feature supports different use cases.
>
Sure.
>>
>> On AMD systems, io_alloc feature is backed by SDCIAE (L3 Smart Data Cache
>> Injection Allocation Enforcement). When enabled, SDCIAE directs all SDCI
>> lines to be placed into the L3 cache partitions specified by the register
>> corresponding to the highest CLOSID supported by the resource. With CDP
>> enabled, io_alloc routes I/O traffic using the highest CLOSID assigned to
>> the instruction cache (L3CODE).
>
> This is a lot of architecture specific text for a resctrl fs patch ... I think
> you are trying to motivate the resctrl fs implementation. Similar motivation
> as proposed for cover letter can be used here to help explain the implementation
> choices.
Updated the whole changelog.
fs/resctrl: Add user interface to enable/disable io_alloc feature
"io_alloc" feature in resctrl enables direct insertion of data from I/O
devices into the cache.
Introduce user interface to enable/disable io_alloc feature.
On AMD systems, when io_alloc is enabled, the highest CLOSID is reserved
exclusively for I/O allocation traffic and is no longer available for
general CPU cache allocation. This feature is disabled by default. Users
are encouraged to enable it only when running workloads that can benefit
from this functionality.
Since CLOSIDs are managed by resctrl fs, it is least invasive to make the
"io_alloc is supported by maximum supported CLOSID" part of the initial
resctrl fs support for io_alloc. Take care not to expose this use of
CLOSID for io_alloc to user space so that this is not required from other
architectures that may support io_alloc differently in the future.
Signed-off-by: Babu Moger <babu.moger@....com>
>
>>
>> Signed-off-by: Babu Moger <babu.moger@....com>
>> ---
>
> ...
>
>> ---
>> Documentation/filesystems/resctrl.rst | 8 ++
>> fs/resctrl/rdtgroup.c | 149 +++++++++++++++++++++++++-
>> 2 files changed, 156 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>> index 354e6a00fa45..189c1ccb92d6 100644
>> --- a/Documentation/filesystems/resctrl.rst
>> +++ b/Documentation/filesystems/resctrl.rst
>> @@ -157,6 +157,14 @@ related to allocation:
>> "not supported":
>> Support not available on the system.
>>
>> + The feature can be modified by writing to the interface, for example:
>> +
>> + To enable:
>> + # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
>> +
>> + To disable:
>> + # echo 0 > /sys/fs/resctrl/info/L3/io_alloc
>> +
>> The underlying implementation may reduce resources available to
>> general (CPU) cache allocation. See architecture specific notes
>> below. Depending on usage requirements the feature can be enabled
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index d7c4417b4516..06c854caa55c 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -70,6 +70,7 @@ static struct seq_buf last_cmd_status;
>> static char last_cmd_status_buf[512];
>>
>> static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
>> +static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
>>
>> static void rdtgroup_destroy_root(void);
>>
>> @@ -232,6 +233,16 @@ bool closid_allocated(unsigned int closid)
>> return !test_bit(closid, closid_free_map);
>> }
>>
>> +static bool resctrl_io_alloc_closid_alloc(u32 io_alloc_closid)
>> +{
>> + return __test_and_clear_bit(io_alloc_closid, closid_free_map);
>> +}
>> +
>> +static void resctrl_io_alloc_closid_free(u32 io_alloc_closid)
>> +{
>> + closid_free(io_alloc_closid);
>> +}
>
> I do not think these should be helpers/wrappers with a separate
> namespace. It will make the code easier to understand if it is clear that
> the "io_alloc" CLOSID is allocated from the same "pool" as the CLOSID for
> control groups.
>
> I would thus propose a specific, for example closid_alloc_fixed(u32 closid)
> helper, and just call closid_free() directly.
>
Sure.
>> +
>> /**
>> * rdtgroup_mode_by_closid - Return mode of resource group with closid
>> * @closid: closid if the resource group
>> @@ -1030,6 +1041,16 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
>> return 0;
>> }
>>
>> +/*
>> + * resctrl_io_alloc_closid_supported() - io_alloc feature utilizes the
>> + * highest CLOSID value to direct I/O traffic. Ensure that io_alloc_closid
>> + * is in the supported range.
>> + */
>> +static bool resctrl_io_alloc_closid_supported(u32 io_alloc_closid)
>
> Please move to ctrlmondata.c
>
Sure.
>> +{
>> + return io_alloc_closid < closids_supported();
>> +}
>> +
>> /*
>> * resctrl_io_alloc_closid() - io_alloc feature routes I/O traffic using
>> * the highest available CLOSID. Retrieve the maximum CLOSID supported by the
>> @@ -1858,6 +1879,131 @@ static int resctrl_io_alloc_show(struct kernfs_open_file *of,
>> return 0;
>> }
>>
>> +/*
>> + * Initialize io_alloc CLOSID cache resource CBM with all usable (shared
>> + * and unused) cache portions.
>> + */
>> +static int resctrl_io_alloc_init_cat(struct rdt_resource *r,
>
> Mixing two features (io_alloc and CAT) in the function name is confusing.
> How about resctrl_io_alloc_init_cbm() and move to ctrlmondata.c?
Yes.
>
>> + struct resctrl_schema *s, u32 closid)
>
> No need to provide resource as parameter, it can be determined from schema.
Sure.
>
>> +{
>> + int ret;
>> +
>> + rdt_staged_configs_clear();
>> +
>> + ret = rdtgroup_init_cat(s, closid);
>> + if (ret < 0)
>> + goto out;
>
> More below, but I think this flow can be simplified by moving the logic
> handling CDP here. If CDP is enabled for the resource then a successful
> rdtgroup_init_cat() can just be followed by a snippet that copies the
> staged config of the CDP type to the staged config of its peer type.
> Their CBMs are supposed to be identical so there is no reason for all the
> rdtgroup_init_cat() processing to be repeated. resctrl_arch_update_domains()
> can handle updating both in a single call.
Yes.
>
>> +
>> + ret = resctrl_arch_update_domains(r, closid);
>> +
>> +out:
>> + rdt_staged_configs_clear();
>> + return ret;
>> +}
>> +
>> +static const char *rdtgroup_name_by_closid(int closid)
>
> This seems generic enough and appropriate for rdtgroup.c
>
>> +{
>> + struct rdtgroup *rdtgrp;
>> +
>> + list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
>> + if (rdtgrp->closid == closid)
>> + return rdt_kn_name(rdtgrp->kn);
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static struct resctrl_schema *resctrl_get_schema(enum resctrl_conf_type type)
>
> This also seems generic enough and appropriate for rdtgroup.c
>
>> +{
>> + struct resctrl_schema *schema;
>> +
>> + list_for_each_entry(schema, &resctrl_schema_all, list) {
>> + if (schema->conf_type == type)
>> + return schema;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
>> + size_t nbytes, loff_t off)
>> +{
>> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
>> + enum resctrl_conf_type peer_type;
>> + struct rdt_resource *r = s->res;
>> + struct resctrl_schema *peer_s;
>> + char const *grp_name;
>> + u32 io_alloc_closid;
>> + bool enable;
>> + int ret;
>> +
>> + ret = kstrtobool(buf, &enable);
>> + if (ret)
>> + return ret;
>> +
>> + cpus_read_lock();
>> + mutex_lock(&rdtgroup_mutex);
>> +
>> + rdt_last_cmd_clear();
>> +
>> + if (!r->cache.io_alloc_capable) {
>> + rdt_last_cmd_printf("io_alloc is not supported on %s\n", s->name);
>> + ret = -ENODEV;
>> + goto out_unlock;
>> + }
>> +
>> + io_alloc_closid = resctrl_io_alloc_closid(r);
>> + if (!resctrl_io_alloc_closid_supported(io_alloc_closid)) {
>> + rdt_last_cmd_printf("io_alloc CLOSID (ctrl_hw_id) %d is not available\n",
>> + io_alloc_closid);
>> + ret = -EINVAL;
>> + goto out_unlock;
>> + }
>> +
>> + /* If the feature is already up to date, no action is needed. */
>> + if (resctrl_arch_get_io_alloc_enabled(r) == enable)
>> + goto out_unlock;
>> +
>> + if (enable) {
>> + if (!resctrl_io_alloc_closid_alloc(io_alloc_closid)) {
>> + grp_name = rdtgroup_name_by_closid(io_alloc_closid);
>> + WARN_ON_ONCE(!grp_name);
>> + rdt_last_cmd_printf("CLOSID (ctrl_hw_id) %d for io_alloc is used by %s group\n",
>> + io_alloc_closid, grp_name ? grp_name : "another");
>> + ret = -ENOSPC;
>> + goto out_unlock;
>> + }
>> +
>> + /* Initialize schema for both CDP_DATA and CDP_CODE when CDP is enabled */
>> + if (resctrl_arch_get_cdp_enabled(r->rid)) {
>
> I think this block can be removed to simplify the flow by moving the CDP handling to
> resctrl_io_alloc_init_cat().
Sure. Also moved the whole function to fs/resctrl/ctrlmondata.c.
>
>> + peer_type = resctrl_peer_type(s->conf_type);
>> + peer_s = resctrl_get_schema(peer_type);
>> + if (peer_s)
>> + ret = resctrl_io_alloc_init_cat(r, peer_s, io_alloc_closid);
>> + }
>> +
>> + if (!ret)
>> + ret = resctrl_io_alloc_init_cat(r, s, io_alloc_closid);
>> +
>> + if (ret) {
>> + rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
>> + resctrl_io_alloc_closid_free(io_alloc_closid);
>> + goto out_unlock;
>> + }
>> +
>> + } else {
>> + resctrl_io_alloc_closid_free(io_alloc_closid);
>> + }
>> +
>> + ret = resctrl_arch_io_alloc_enable(r, enable);
>> +
>> +out_unlock:
>> + mutex_unlock(&rdtgroup_mutex);
>> + cpus_read_unlock();
>> +
>> + return ret ?: nbytes;
>> +}
>> +
>> /* rdtgroup information files for one cache resource. */
>> static struct rftype res_common_files[] = {
>> {
>> @@ -1950,9 +2096,10 @@ static struct rftype res_common_files[] = {
>> },
>> {
>> .name = "io_alloc",
>> - .mode = 0444,
>> + .mode = 0644,
>> .kf_ops = &rdtgroup_kf_single_ops,
>> .seq_show = resctrl_io_alloc_show,
>> + .write = resctrl_io_alloc_write,
>> },
>> {
>> .name = "max_threshold_occupancy",
>
> Reinette
>
Thanks
Babu
Powered by blists - more mailing lists