[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <6f7064af-0680-4b5d-9165-1687137eaba9@intel.com>
Date: Mon, 23 Dec 2024 13:37:54 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <tglx@...utronix.de>, <mingo@...hat.com>,
<bp@...en8.de>, <dave.hansen@...ux.intel.com>
CC: <fenghua.yu@...el.com>, <x86@...nel.org>, <hpa@...or.com>,
<akpm@...ux-foundation.org>, <paulmck@...nel.org>, <thuth@...hat.com>,
<rostedt@...dmis.org>, <xiongwei.song@...driver.com>,
<pawan.kumar.gupta@...ux.intel.com>, <jpoimboe@...nel.org>,
<daniel.sneddon@...ux.intel.com>, <thomas.lendacky@....com>,
<perry.yuan@....com>, <sandipan.das@....com>, <kai.huang@...el.com>,
<seanjc@...gle.com>, <xin3.li@...el.com>, <ebiggers@...gle.com>,
<andrew.cooper3@...rix.com>, <mario.limonciello@....com>,
<tan.shaopeng@...itsu.com>, <james.morse@....com>, <tony.luck@...el.com>,
<peternewman@...gle.com>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <eranian@...gle.com>, <corbet@....net>
Subject: Re: [PATCH v2 5/7] x86/resctrl: Add interface to enable/disable
io_alloc feature
Hi Babu,
On 12/18/24 1:38 PM, Babu Moger wrote:
> The io_alloc feature in resctrl enables system software to configure
> the portion of the L3 cache allocated for I/O traffic.
>
Above is about resctrl feature.
> Smart Data Cache Injection (SDCI) is a mechanism that allows direct
> insertion of data from I/O devices into the L3 cache. By caching I/O
> data directly in the L3 cache, instead of writing it to DRAM first,
> SDCI reduces DRAM bandwidth usage and lowers latency for the processor
> consuming the I/O data.
>
> When enabled, SDCIAE forces all SDCI lines to be placed into the L3 cache
> partitions identified by the highest-supported L3_MASK_n register as
> reported by CPUID Fn0000_0010_EDX_x1.MAX_COS. For example, if MAX_COS=15,
> SDCI lines will be allocated into the L3 cache partitions determined by
> the bitmask in the L3_MASK_15 register.
Above is about AMD feature.
>
> Introduce interface to enable/disable "io_alloc" feature on user input.
Back to resctrl feature.
Please do not jump from resctrl to AMD feature in a way that makes it seem that
they are interchangeable. To help with this you could use similar style as in
ABMC where the text flows like:
<resctrl feature description>.
On AMD <resctrl feature> is backed by <AMD feature> that <AMD feature details>.
>
> Signed-off-by: Babu Moger <babu.moger@....com>
> ---
> v2: Renamed the feature to "io_alloc".
> Added generic texts for the feature in commit log and resctrl.rst doc.
> Added resctrl_io_alloc_init_cat() to initialize io_alloc to default
> values when enabled.
> Fixed io_alloc show functinality to display only on L3 resource.
> ---
> Documentation/arch/x86/resctrl.rst | 27 ++++++
> arch/x86/kernel/cpu/resctrl/core.c | 2 +
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 118 +++++++++++++++++++++++++
> 3 files changed, 147 insertions(+)
>
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index 6768fc1fad16..52679175ee14 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -135,6 +135,33 @@ related to allocation:
> "1":
> Non-contiguous 1s value in CBM is supported.
>
> +"io_alloc":
> + The "io_alloc" feature in resctrl enables system software to
> + configure the portion of the L3 cache allocated for I/O traffic.
> +
> + Smart Data Cache Injection (SDCI) is a mechanism that allows
> + direct insertion of data from I/O devices into the L3 cache.
> + By caching I/O data directly in the L3 cache, instead of writing
> + it to DRAM first, SDCI reduces DRAM bandwidth usage and lowers
> + latency for the processor consuming the I/O data.
> +
> + When enabled the feature forces all SDCI lines to be placed
> + into the L3 cache partitions identified by the highest-supported
> + CLOSID (num_closids-1). This CLOSID will not be available to the
> + resctrl group.
Same comment as V1. The above two paragraphs cannot be guaranteed to be
specific to the "io_alloc" feature ... it is only specific to SDCIAE.
> +
> + "0":
> + I/O device L3 cache control is not enabled.
> + "1":
> + I/O device L3 cache control is enabled, allowing users
> + to manage the portions of the L3 cache allocated for
> + the I/O device.
> +
> + Feature can be enabled/disabled by writing to the interface.
> + Example::
> +
> + # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
Similar to comment of V1 there is no information about what user can
expect when enabling this. For example, if this fails then one cause may
be that a resource group already owns that CLOSID and that removing that resource
group would make it possible to enable this feature. Even so, user space does not
know about CLOSIDs, only resource groups, making it difficult to correct without
more help.
> +
> Memory bandwidth(MB) subdirectory contains the following files
> with respect to allocation:
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 39e110033d96..066a7997eaf1 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -309,6 +309,8 @@ static void rdt_get_cdp_config(int level)
> static void rdt_get_sdciae_alloc_cfg(struct rdt_resource *r)
> {
> r->cache.io_alloc_capable = true;
> + resctrl_file_fflags_init("io_alloc",
> + RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
> }
>
> static void rdt_get_cdp_l3_config(void)
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 398f241b65d5..e30731ce9335 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -62,6 +62,7 @@ static char last_cmd_status_buf[512];
>
> static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
> static void rdtgroup_destroy_root(void);
> +static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
>
> struct dentry *debugfs_resctrl;
>
> @@ -180,6 +181,25 @@ void closid_free(int closid)
> __set_bit(closid, &closid_free_map);
> }
>
> +/*
> + * io_alloc (SDCIAE) feature uses max CLOSID to route the SDCI traffic.
Please do not use io_alloc and SDCIAE interchangeably.
> + * Get the max CLOSID number
> + */
> +static u32 resctrl_io_alloc_closid_get(struct rdt_resource *r)
> +{
> + return resctrl_arch_get_num_closid(r) - 1;
> +}
> +
> +static int resctrl_io_alloc_closid_alloc(struct rdt_resource *r)
> +{
> + u32 io_alloc_closid = resctrl_io_alloc_closid_get(r);
> +
> + if (__test_and_clear_bit(io_alloc_closid, &closid_free_map))
> + return io_alloc_closid;
> + else
> + return -ENOSPC;
> +}
This does not look right. The way resctrl manages CLOSID is to use the
*minimum* of all CLOSID supported across all resources. It may thus be possible
for the L3 resource to support more CLOSID than other resources causing
the closid_free_map to be sized to a value smaller than the L3 max CLOSID.
The bit being tested/cleared here may thus exceed what is in the bitmap.
Also, during V1 we discussed how CDP was not handled and I am not able to
see where/if it is handled in this version.
> +
> /**
> * closid_allocated - test if provided closid is in use
> * @closid: closid to be tested
> @@ -1832,6 +1852,97 @@ int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
> return 0;
> }
>
> +static int resctrl_io_alloc_show(struct kernfs_open_file *of,
> + struct seq_file *seq, void *v)
> +{
> + struct resctrl_schema *s = of->kn->parent->priv;
> + struct rdt_resource *r = s->res;
> +
> + seq_printf(seq, "%x\n", resctrl_arch_get_io_alloc_enabled(r->rid));
> + return 0;
> +}
> +
> +/*
> + * Initialize the io_alloc feature default when enabled
It is not clear what this comment describes.
> + */
> +static int resctrl_io_alloc_init_cat(struct rdt_resource *r, u32 closid)
> +{
> + struct resctrl_schema *s;
> + int ret = 0;
> +
> + rdt_staged_configs_clear();
> +
> + list_for_each_entry(s, &resctrl_schema_all, list) {
> + r = s->res;
> + if (r->rid == RDT_RESOURCE_L3) {
It looks like the function ignores the resource provided to it via function
parameter and instead uses internal hardcode of which resource to act on?
> + ret = rdtgroup_init_cat(s, closid);
> + if (ret < 0)
> + goto out_init_cat;
> +
> + ret = resctrl_arch_update_domains(r, closid);
> + if (ret < 0)
> + goto out_init_cat;
> + }
> + }
> +
> +out_init_cat:
> + if (ret)
> + rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
> +
> + rdt_staged_configs_clear();
> + return ret;
> +}
> +
> +static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
> + size_t nbytes, loff_t off)
> +{
> + struct resctrl_schema *s = of->kn->parent->priv;
> + struct rdt_resource *r = s->res;
> + u32 io_alloc_closid;
> + bool enable;
> + int ret;
> +
> + if (!r->cache.io_alloc_capable)
> + return -EINVAL;
> +
> + ret = kstrtobool(buf, &enable);
> + if (ret)
> + return ret;
> +
> + cpus_read_lock();
> + mutex_lock(&rdtgroup_mutex);
> +
> + rdt_last_cmd_clear();
> +
> + io_alloc_closid = resctrl_io_alloc_closid_get(r);
> +
> + if (resctrl_arch_get_io_alloc_enabled(r->rid) != enable) {
> + if (enable) {
> + ret = resctrl_io_alloc_closid_alloc(r);
> + if (ret < 0) {
> + rdt_last_cmd_puts("io_alloc CLOSID is not available\n");
Can this be more useful to the user? The user does not know what the CLOSID is nor
what can be remedied to fix this. What if the message instead contains the name of
the resource group to which the CLOSID is assigned so that user knows which resource
group could be removed to be able to enable io_alloc?
> + goto out_io_alloc;
> + }
> + ret = resctrl_io_alloc_init_cat(r, io_alloc_closid);
> + if (ret) {
> + closid_free(io_alloc_closid);
Could you please make a resctrl_io_alloc_closid_free() that is symmetrical to
resctrl_io_alloc_closid_alloc()?
> + goto out_io_alloc;
> + }
> +
> + } else {
> + closid_free(io_alloc_closid);
> + }
> +
> + ret = resctrl_arch_io_alloc_enable(r, enable);
> + }
> +
> +out_io_alloc:
> + mutex_unlock(&rdtgroup_mutex);
> + cpus_read_unlock();
> +
> + return ret ?: nbytes;
> +}
> +
> /* rdtgroup information files for one cache resource. */
> static struct rftype res_common_files[] = {
> {
> @@ -1984,6 +2095,13 @@ static struct rftype res_common_files[] = {
> .seq_show = rdtgroup_schemata_show,
> .fflags = RFTYPE_CTRL_BASE,
> },
> + {
> + .name = "io_alloc",
> + .mode = 0644,
> + .kf_ops = &rdtgroup_kf_single_ops,
> + .seq_show = resctrl_io_alloc_show,
> + .write = resctrl_io_alloc_write,
> + },
> {
> .name = "mba_MBps_event",
> .mode = 0644,
Reinette
Powered by blists - more mailing lists