lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cd0585f6-1d90-4ef1-9a10-7db50cb879ea@intel.com>
Date: Fri, 13 Sep 2024 13:51:33 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <corbet@....net>, <tglx@...utronix.de>,
	<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>,
	<x86@...nel.org>
CC: <fenghua.yu@...el.com>, <hpa@...or.com>, <paulmck@...nel.org>,
	<thuth@...hat.com>, <xiongwei.song@...driver.com>, <ardb@...nel.org>,
	<pawan.kumar.gupta@...ux.intel.com>, <daniel.sneddon@...ux.intel.com>,
	<sandipan.das@....com>, <kai.huang@...el.com>, <peterz@...radead.org>,
	<kan.liang@...ux.intel.com>, <pbonzini@...hat.com>, <xin3.li@...el.com>,
	<ebiggers@...gle.com>, <alexandre.chartre@...cle.com>, <perry.yuan@....com>,
	<tan.shaopeng@...itsu.com>, <james.morse@....com>, <tony.luck@...el.com>,
	<maciej.wieczor-retman@...el.com>, <linux-doc@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <peternewman@...gle.com>,
	<eranian@...gle.com>
Subject: Re: [PATCH 5/7] x86/resctrl: Add interface to enable/disable SDCIAE

Hi Babu,

On 8/16/24 9:16 AM, Babu Moger wrote:
> The SDCIAE (SDCI Allocation Enforcement) PQE feature allows system software
> to configure the portion of the L3 cache used for SDCI.
> 
> When enabled, SDCIAE forces all SDCI lines to be placed into the L3 cache
> partitions identified by the highest-supported L3_MASK_n register as
> reported by CPUID Fn0000_0010_EDX_x1.MAX_COS. For example, if MAX_COS=15,
> SDCI lines will be allocated into the L3 cache partitions determined by
> the bitmask in the L3_MASK_15 register.
> 
> Introduce interface to enable/disable SDCIAE feature on user input.
> 
> Signed-off-by: Babu Moger <babu.moger@....com>
> ---
>   Documentation/arch/x86/resctrl.rst     | 22 +++++++
>   arch/x86/kernel/cpu/resctrl/core.c     |  1 +
>   arch/x86/kernel/cpu/resctrl/internal.h |  1 +
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 88 ++++++++++++++++++++++++++
>   4 files changed, 112 insertions(+)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index a824affd741d..cb1532dd843f 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -135,6 +135,28 @@ related to allocation:
>   			"1":
>   			      Non-contiguous 1s value in CBM is supported.
>   
> +"sdciae":
> +		Indicates if the system can support SDCIAE (L3 Smart Data Cache
> +		Injection Allocation Enforcement) feature.
> +
> +		Smart Data Cache Injection (SDCI) is a mechanism that enables
> +		direct insertion of data from I/O devices into the L3 cache.
> +		By directly caching data from I/O devices rather than first
> +		storing the I/O data in DRAM, SDCI reduces demands on DRAM
> +		bandwidth and reduces latency to the processor consuming the
> +		I/O data. The SDCIAE feature allows system software to configure
> +		limit the portion of the L3 cache used for SDCI.

Above needs to change to a generic resctrl fs feature.

> +
> +			"0":
> +			      Feature is not enabled.
> +			"1":
> +			      Feature is enabled.

What does "feature is enabled" and "feature is not enabled" mean to the user?
What can the user expect to happen after enabling/disabling this feature?

> +
> +		Feature can be enabled/disabled by writing to the interface.
> +		Example::
> +
> +			# echo 1 > /sys/fs/resctrl/info/L3/sdciae
> +
>   Memory bandwidth(MB) subdirectory contains the following files
>   with respect to allocation:
>   
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index e4381e3feb75..6a9512008a4a 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -299,6 +299,7 @@ static void rdt_get_cdp_config(int level)
>   static void rdt_get_sdciae_alloc_cfg(struct rdt_resource *r)
>   {
>   	r->sdciae_capable = true;
> +	resctrl_sdciae_rftype_init();
>   }
>   
>   static void rdt_get_cdp_l3_config(void)
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index ceb0e8e1ed76..9a3da6d49144 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -662,6 +662,7 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>   void __init thread_throttle_mode_init(void);
>   void __init mbm_config_rftype_init(const char *config);
>   void rdt_staged_configs_clear(void);
> +void __init resctrl_sdciae_rftype_init(void);
>   bool closid_allocated(unsigned int closid);
>   int resctrl_find_cleanest_closid(void);
>   
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index c62d6183bfe4..58e4df195207 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -171,6 +171,27 @@ void closid_free(int closid)
>   	__set_bit(closid, &closid_free_map);
>   }
>   
> +/*
> + * SDCIAE feature uses max CLOSID to route the SDCI traffic.
> + * Get the max CLOSID number
> + */
> +static u32 get_sdciae_closid(struct rdt_resource *r)
> +{
> +	return resctrl_arch_get_num_closid(r) - 1;
> +}
> +
> +static int closid_alloc_sdciae(struct rdt_resource *r)
> +{
> +	u32 sdciae_closid = get_sdciae_closid(r);
> +
> +	if (closid_free_map & (1 << sdciae_closid)) {
> +		closid_free_map &= ~(1 << sdciae_closid);
> +		return sdciae_closid;
> +	} else {
> +		return -ENOSPC;
> +	}
> +}

How does this interact with CDP? It seems to me that the above would
cause overflow on a CDP system since the closid_free_map is sized to
half of what the hardware supports. This also seems to still allow
creating resource groups that may end up using the CLOSID dedicated
to SDCIAE here since the (when CDP enabled) resource groups use
software closid and then hardware configuration is done with the
hardware closid. When CDP is enabled it seems possible to still
create a resource group and modify the CBM of what is then intended to
be sdciae allocations?

Also, please be consistent with function naming, note how the above
two functions differ wrt namespace and verb. resctrl is surely not
consistent in this regard but it really helps to have partner functions
look similar. For example, this could be "feature_closid_get()" and
"feature_closid_alloc()".

Also, there looks to be opportunity to use bitops here ... perhaps
__test_and_clear_bit()?


> +
>   /**
>    * closid_allocated - test if provided closid is in use
>    * @closid: closid to be tested
> @@ -1850,6 +1871,57 @@ int resctrl_arch_set_sdciae_enabled(enum resctrl_res_level l, bool enable)
>   	return 0;
>   }
>   
> +static int resctrl_sdciae_show(struct kernfs_open_file *of,
> +			       struct seq_file *seq, void *v)
> +{
> +	seq_printf(seq, "%x\n", resctrl_arch_get_sdciae_enabled(RDT_RESOURCE_L3));
> +	return 0;
> +}

This does not look right ... this file has flags "RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE"
which means that it will be created for all CAT resources. So, on a system that supports
L2 CAT, the "sdciae" file will be created for the L2 resource and it will show whether
"sdciae" is enabled on the *L3* resource?

> +
> +static ssize_t resctrl_sdciae_write(struct kernfs_open_file *of, char *buf,
> +				    size_t nbytes, loff_t off)
> +{
> +	struct resctrl_schema *s = of->kn->parent->priv;
> +	struct rdt_resource *r = s->res;
> +	unsigned int enable;
> +	u32 sdciae_closid;
> +	int ret;
> +
> +	if (!r->sdciae_capable)
> +		return -EINVAL;
> +
> +	ret = kstrtouint(buf, 0, &enable);

How about kstrtobool()? This will make things more consistent with
resctrl_arch_set_sdciae_enabled() expecting a bool. Or should we be looking ahead
at this file later possibly needing to support more integers to activate more capabilities
related to this feature? In that case this implementation needs to take care since instead
of supporting "0" and "1" it supports "0" and "anything but 0" that prevents any such
future enhancements.


> +	if (ret)
> +		return ret;
> +
> +	cpus_read_lock();
> +	mutex_lock(&rdtgroup_mutex);
> +
> +	rdt_last_cmd_clear();
> +
> +	/* Update the MSR only when there is a change */

The resctrl fs cannot make predictions on what arch code needs to do to enable feature.

> +	if (resctrl_arch_get_sdciae_enabled(RDT_RESOURCE_L3) != enable) {

(same issue with this file being present under L2 resource creating confusion)

> +		if (enable) {
> +			ret = closid_alloc_sdciae(r);
> +			if (ret < 0) {
> +				rdt_last_cmd_puts("SDCIAE CLOSID is not available\n");
> +				goto out_sdciae;
> +			}
> +		} else {
> +			sdciae_closid = get_sdciae_closid(r);
> +			closid_free(sdciae_closid);
> +		}


> +
> +		ret = resctrl_arch_set_sdciae_enabled(RDT_RESOURCE_L3, enable);

I assume that once SDCIAE is enabled the I/O traffic will start flowing to whatever
was the last CBM of the max CLOSID? Is this intended or should there be some default
CBM that this feature should start with?

> +	}
> +
> +out_sdciae:
> +	mutex_unlock(&rdtgroup_mutex);
> +	cpus_read_unlock();
> +
> +	return ret ?: nbytes;
> +}
> +
>   /* rdtgroup information files for one cache resource. */
>   static struct rftype res_common_files[] = {
>   	{
> @@ -2002,6 +2074,13 @@ static struct rftype res_common_files[] = {
>   		.seq_show	= rdtgroup_schemata_show,
>   		.fflags		= RFTYPE_CTRL_BASE,
>   	},
> +	{
> +		.name		= "sdciae",
> +		.mode		= 0644,
> +		.kf_ops		= &rdtgroup_kf_single_ops,
> +		.seq_show	= resctrl_sdciae_show,
> +		.write		= resctrl_sdciae_write,
> +	},
>   	{
>   		.name		= "mode",
>   		.mode		= 0644,
> @@ -2101,6 +2180,15 @@ void __init mbm_config_rftype_init(const char *config)
>   		rft->fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE;
>   }
>   
> +void __init resctrl_sdciae_rftype_init(void)
> +{
> +	struct rftype *rft;
> +
> +	rft = rdtgroup_get_rftype_by_name("sdciae");
> +	if (rft)
> +		rft->fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE;
> +}
> +

Another instance of the pattern that is impacted by the ABMC and MPAM work.

>   /**
>    * rdtgroup_kn_mode_restrict - Restrict user access to named resctrl file
>    * @r: The resource group with which the file is associated.

Reinette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ