Message-ID: <d18dc408-0a05-47b4-9126-19a7bd5fff6b@intel.com>
Date: Wed, 17 Sep 2025 22:37:33 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <corbet@....net>, <tony.luck@...el.com>,
<Dave.Martin@....com>, <james.morse@....com>, <tglx@...utronix.de>,
<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <kas@...nel.org>,
<rick.p.edgecombe@...el.com>, <akpm@...ux-foundation.org>,
<paulmck@...nel.org>, <pmladek@...e.com>,
<pawan.kumar.gupta@...ux.intel.com>, <rostedt@...dmis.org>,
<kees@...nel.org>, <arnd@...db.de>, <fvdl@...gle.com>, <seanjc@...gle.com>,
<thomas.lendacky@....com>, <manali.shukla@....com>, <perry.yuan@....com>,
<sohil.mehta@...el.com>, <xin@...or.com>, <peterz@...radead.org>,
<mario.limonciello@....com>, <gautham.shenoy@....com>, <nikunj@....com>,
<dapeng1.mi@...ux.intel.com>, <ak@...ux.intel.com>,
<chang.seok.bae@...el.com>, <ebiggers@...gle.com>,
<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-coco@...ts.linux.dev>, <kvm@...r.kernel.org>
Subject: Re: [PATCH v9 06/10] fs/resctrl: Add user interface to enable/disable
io_alloc feature
Hi Babu,
On 9/2/25 3:41 PM, Babu Moger wrote:
> "io_alloc" feature in resctrl enables direct insertion of data from I/O
> devices into the cache.
(repetition)
>
> On AMD systems, when io_alloc is enabled, the highest CLOSID is reserved
> exclusively for I/O allocation traffic and is no longer available for
> general CPU cache allocation. Users are encouraged to enable it only when
> running workloads that can benefit from this functionality.
>
> Since CLOSIDs are managed by resctrl fs, it is least invasive to make the
> "io_alloc is supported by maximum supported CLOSID" part of the initial
> resctrl fs support for io_alloc. Take care not to expose this use of CLOSID
> for io_alloc to user space so that this is not required from other
> architectures that may support io_alloc differently in the future.
>
> Introduce user interface to enable/disable io_alloc feature. Check to
> verify the availability of CLOSID reserved for io_alloc, and initialize
> the CLOSID with a usable CBMs across all the domains.
I think the flow will improve if the above two paragraphs are swapped. This is
also missing the non-obvious support for CDP. As mentioned in the previous patch,
if the related doc change is moved from patch 5 to here it can be handled together.
Trying to put it all together, please feel free to improve:
AMD's SDCIAE forces all SDCI lines to be placed into the L3 cache portions
identified by the highest-supported L3_MASK_n register, where n is the maximum
supported CLOSID.
To support AMD's SDCIAE, when the io_alloc resctrl feature is enabled, reserve
the highest CLOSID exclusively for I/O allocation traffic, making it no longer
available for general CPU cache allocation.
Introduce a user interface to enable/disable the io_alloc feature and encourage
users to enable io_alloc only when running workloads that can benefit from this
functionality. On enable, initialize the io_alloc CLOSID with all usable CBMs
across all the domains.
Since CLOSIDs are managed by resctrl fs, it is least invasive to make
"io_alloc is supported by the maximum supported CLOSID" part of the initial
resctrl fs support for io_alloc. Take care to only minimally (in error messages)
expose this use of CLOSID for io_alloc to user space, so that this is not
required from other architectures that may support io_alloc differently in the
future.
When resctrl is mounted with "-o cdp" to enable code/data prioritization
there are two L3 resources that can support I/O allocation: L3CODE and L3DATA.
From the resctrl fs perspective the two resources share a CLOSID, and the
architecture's available CLOSIDs are halved to support this.
The architecture's underlying CLOSID used by SDCIAE when CDP is enabled is
the CLOSID associated with the L3CODE resource, but from resctrl's perspective
there is only one CLOSID for both L3CODE and L3DATA. L3DATA is thus not usable
for general (CPU) cache allocation nor I/O allocation. Keep the L3CODE and
L3DATA io_alloc status in sync to avoid confusing user space. That is,
enabling io_alloc on L3CODE does so on L3DATA and vice-versa, and the I/O
allocation CBMs of L3CODE and L3DATA are kept in sync.
>
> Signed-off-by: Babu Moger <babu.moger@....com>
> ---
...
> +ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
> + size_t nbytes, loff_t off)
> +{
> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
> + struct rdt_resource *r = s->res;
> + char const *grp_name;
> + u32 io_alloc_closid;
> + bool enable;
> + int ret;
> +
> + ret = kstrtobool(buf, &enable);
> + if (ret)
> + return ret;
> +
> + cpus_read_lock();
> + mutex_lock(&rdtgroup_mutex);
> +
> + rdt_last_cmd_clear();
> +
> + if (!r->cache.io_alloc_capable) {
> + rdt_last_cmd_printf("io_alloc is not supported on %s\n", s->name);
> + ret = -ENODEV;
> + goto out_unlock;
> + }
> +
> + /* If the feature is already up to date, no action is needed. */
> + if (resctrl_arch_get_io_alloc_enabled(r) == enable)
> + goto out_unlock;
> +
> + io_alloc_closid = resctrl_io_alloc_closid(r);
> + if (!resctrl_io_alloc_closid_supported(io_alloc_closid)) {
> + rdt_last_cmd_printf("io_alloc CLOSID (ctrl_hw_id) %d is not available\n",
%d -> %u ?
> + io_alloc_closid);
> + ret = -EINVAL;
> + goto out_unlock;
> + }
> +
> + if (enable) {
> + if (!closid_alloc_fixed(io_alloc_closid)) {
> + grp_name = rdtgroup_name_by_closid(io_alloc_closid);
> + WARN_ON_ONCE(!grp_name);
> + rdt_last_cmd_printf("CLOSID (ctrl_hw_id) %d for io_alloc is used by %s group\n",
%d -> %u ?
> + io_alloc_closid, grp_name ? grp_name : "another");
> + ret = -ENOSPC;
> + goto out_unlock;
> + }
> +
> + ret = resctrl_io_alloc_init_cbm(s, io_alloc_closid);
> + if (ret) {
> + rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
> + closid_free(io_alloc_closid);
> + goto out_unlock;
> + }
> + } else {
> + closid_free(io_alloc_closid);
> + }
> +
> + ret = resctrl_arch_io_alloc_enable(r, enable);
> +
> +out_unlock:
> + mutex_unlock(&rdtgroup_mutex);
> + cpus_read_unlock();
> +
> + return ret ?: nbytes;
> +}
Reinette