lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9d0bf9fc-f100-45f4-b266-47786f1b8aa3@intel.com>
Date: Fri, 12 Jul 2024 15:13:06 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <corbet@....net>, <fenghua.yu@...el.com>,
	<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
	<dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
	<rdunlap@...radead.org>, <tj@...nel.org>, <peterz@...radead.org>,
	<yanjiewtw@...il.com>, <kim.phillips@....com>, <lukas.bulwahn@...il.com>,
	<seanjc@...gle.com>, <jmattson@...gle.com>, <leitao@...ian.org>,
	<jpoimboe@...nel.org>, <rick.p.edgecombe@...el.com>,
	<kirill.shutemov@...ux.intel.com>, <jithu.joseph@...el.com>,
	<kai.huang@...el.com>, <kan.liang@...ux.intel.com>,
	<daniel.sneddon@...ux.intel.com>, <pbonzini@...hat.com>,
	<sandipan.das@....com>, <ilpo.jarvinen@...ux.intel.com>,
	<peternewman@...gle.com>, <maciej.wieczor-retman@...el.com>,
	<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<eranian@...gle.com>, <james.morse@....com>
Subject: Re: [PATCH v5 12/20] x86/resctrl: Add data structures and definitions
 for ABMC assignment

Hi Babu,

On 7/3/24 2:48 PM, Babu Moger wrote:
> The ABMC feature provides an option to the user to assign a hardware
> counter to an RMID and monitor the bandwidth as long as the counter
> is assigned. The bandwidth events will be tracked by the hardware until
> the user changes the configuration. Each resctrl group can configure
> maximum two counters, one for total event and one for local event.
> 
> The counters are configured by writing to MSR L3_QOS_ABMC_CFG.
> Configuration is done by setting the counter id, bandwidth source (RMID)
> and bandwidth configuration supported by BMEC(Bandwidth Monitoring Event
> Configuration). Reading L3_QOS_ABMC_DSC returns the configuration of the
> counter id specified in L3_QOS_ABMC_CFG.
> 
> Attempts to read or write these MSRs when ABMC is not enabled will result
> in a #GP(0) exception.
> 
> Introduce data structures and definitions for ABMC assignments.
> 
> MSR L3_QOS_ABMC_CFG (0xC000_03FDh) and L3_QOS_ABMC_DSC (0xC000_03FEh)
> details.
> =========================================================================
> Bits 	Mnemonic	Description			Access Reset
> 							Type   Value
> =========================================================================
> 63 	CfgEn 		Configuration Enable 		R/W 	0
> 
> 62 	CtrEn 		Enable/disable Tracking		R/W 	0
> 
> 61:53 	– 		Reserved 			MBZ 	0
> 
> 52:48 	CtrID 		Counter Identifier		R/W	0
> 
> 47 	IsCOS		BwSrc field is a CLOSID		R/W	0
> 			(not an RMID)
> 
> 46:44 	–		Reserved			MBZ	0
> 
> 43:32	BwSrc		Bandwidth Source		R/W	0
> 			(RMID or CLOSID)
> 
> 31:0	BwType		Bandwidth configuration		R/W	0
> 			to track for this counter
> ==========================================================================
> 
> The feature details are documented in the APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC).

The changelog only describes the hardware interface yet the patch contains
part hardware interface part new driver support for hardware interface.

> 
> Signed-off-by: Babu Moger <babu.moger@....com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> ---
> v5: Moved assignment flags here (path 10/19 of v4).
>      Added MON_CNTR_UNSET definition to initialize cntr_id's.
>      More details in commit log.
>      Renamed few fields in l3_qos_abmc_cfg for readability.
> 
> v4: Added more descriptions.
>      Changed the name abmc_ctr_id to ctr_id.
>      Added L3_QOS_ABMC_DSC. Used for reading the configuration.
> 
> v3: No changes.
> 
> v2: No changes.
> ---
>   arch/x86/include/asm/msr-index.h       |  2 ++
>   arch/x86/kernel/cpu/resctrl/internal.h | 40 ++++++++++++++++++++++++++
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 18 ++++++++++++
>   3 files changed, 60 insertions(+)
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 263b2d9d00ed..5e44ff91f459 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -1175,6 +1175,8 @@
>   #define MSR_IA32_SMBA_BW_BASE		0xc0000280
>   #define MSR_IA32_EVT_CFG_BASE		0xc0000400
>   #define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
> +#define MSR_IA32_L3_QOS_ABMC_CFG	0xc00003fd
> +#define MSR_IA32_L3_QOS_ABMC_DSC	0xc00003fe
>   
>   /* MSR_IA32_VMX_MISC bits */
>   #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 4cb1a5d014a3..6925c947682d 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -100,6 +100,18 @@ cpumask_any_housekeeping(const struct cpumask *mask, int exclude_cpu)
>   /* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature */
>   #define ABMC_ENABLE			BIT(0)
>   
> +/*
> + * Assignment flags for ABMC feature
> + */
> +#define ASSIGN_NONE	0
> +#define ASSIGN_TOTAL	BIT(QOS_L3_MBM_TOTAL_EVENT_ID)
> +#define ASSIGN_LOCAL	BIT(QOS_L3_MBM_LOCAL_EVENT_ID)

These flags do not appear to be part of hardware interface and there
is no explanation for what they mean or how they will be used. They are
also not used in this patch. It is thus not possible to understand if
they belong in this patch or is appropriate in this work.

> +
> +#define MON_CNTR_UNSET	U32_MAX
> +
> +/* Maximum assignable counters per resctrl group */
> +#define MAX_CNTRS	2
> +
>   struct rdt_fs_context {
>   	struct kernfs_fs_context	kfc;
>   	bool				enable_cdpl2;
> @@ -228,12 +240,14 @@ enum rdtgrp_mode {
>    * @parent:			parent rdtgrp
>    * @crdtgrp_list:		child rdtgroup node list
>    * @rmid:			rmid for this rdtgroup
> + * @cntr_id:			ABMC counter ids assigned to this group

struct mongroup is private to resctrl fs so it cannot contain an
architecture specific feature. Having it contain a generic "cntr_id"
may be ok at this point, but it should not be termed "ABMC counter".

>    */
>   struct mongroup {
>   	struct kernfs_node	*mon_data_kn;
>   	struct rdtgroup		*parent;
>   	struct list_head	crdtgrp_list;
>   	u32			rmid;
> +	u32			cntr_id[MAX_CNTRS];

This is a significant addition yet is silently included as part of a patch
that just introduces hardware interface. This is how resctrl will manage
the hardware counters. It is significant since this is what dictates that it
is resctrl fs that will manage the counters, which makes it important which
interfaces are made available and from where it is called. Through
this series I have also not come across a description of this architecture.
With this introduction counters are maintained per monitor group, yet
the new interface supports assigining counters per domain. There
is no high level explanation of this architecture and the reader is forced
to decipher it from the implementation making this work harder to review
that necessary.

Would it be possible to present the fs and architecture code
separately? I think doing so will make it easier to understand.

>   };
>   
>   /**
> @@ -607,6 +621,32 @@ union cpuid_0x10_x_edx {
>   	unsigned int full;
>   };
>   
> +/*
> + * ABMC counters can be configured by writing to L3_QOS_ABMC_CFG.
> + * @bw_type		: Bandwidth configuration(supported by BMEC)
> + *			  to track this counter id.

Does "to track this counter id" mean "tracked by @cntr_id"?

> + * @bw_src		: Bandwidth Source (RMID or CLOSID).

Please do not capitalize words mid sentence, like "Source"
above, "Identifier", and "Enable" in two instances below.

> + * @reserved1		: Reserved.
> + * @is_clos		: BwSrc field is a CLOSID (not an RMID).

Just stick to @bw_src.

> + * @cntr_id		: Counter Identifier.
> + * @reserved		: Reserved.
> + * @cntr_en		: Tracking Enable bit.

Can this be more detailed about what happens when this bit is set/clear?

> + * @cfg_en		: Configuration Enable bit.

What is difference between "configuration enable" and "tracking enable"?
What is relationship, if any, to @bw_type that is the bandwidth configuration?

> + */
> +union l3_qos_abmc_cfg {
> +	struct {
> +		unsigned long	bw_type	:32,
> +				bw_src	:12,
> +				reserved1: 3,
> +				is_clos	: 1,
> +				cntr_id	: 5,
> +				reserved : 9,
> +				cntr_en	: 1,
> +				cfg_en	: 1;
> +	} split;

Please check the spacing in this data structure. Tabs are used inconsistently
and the members are not lining up either.

> +	unsigned long full;
> +};
> +
>   void rdt_last_cmd_clear(void);
>   void rdt_last_cmd_puts(const char *s);
>   __printf(1, 2)
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 91c5d45ac367..d2663f1345b7 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2505,6 +2505,7 @@ static void resctrl_abmc_set_one_amd(void *arg)
>   
>   static int _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
>   {
> +	struct rdtgroup *prgrp, *crgrp;
>   	struct rdt_mon_domain *d;
>   
>   	/*
> @@ -2513,6 +2514,17 @@ static int _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
>   	 */
>   	mbm_cntrs_init();
>   
> +	/* Reset the cntr_id's for all the monitor groups */
> +	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
> +		prgrp->mon.cntr_id[0] = MON_CNTR_UNSET;
> +		prgrp->mon.cntr_id[1] = MON_CNTR_UNSET;
> +		list_for_each_entry(crgrp, &prgrp->mon.crdtgrp_list,
> +				    mon.crdtgrp_list) {
> +			crgrp->mon.cntr_id[0] = MON_CNTR_UNSET;
> +			crgrp->mon.cntr_id[1] = MON_CNTR_UNSET;
> +		}
> +	}
> +

No. The counters are in the monitor group that is a structure that is private
to the fs. The architecture code should not be accessing it. This should only be
done by fs code.

>   	/*
>   	 * Hardware counters will reset after switching the monitor mode.
>   	 * Reset the architectural state so that reading of hardware
> @@ -3573,6 +3585,8 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
>   		return ret;
>   	}
>   	rdtgrp->mon.rmid = ret;
> +	rdtgrp->mon.cntr_id[0] = MON_CNTR_UNSET;
> +	rdtgrp->mon.cntr_id[1] = MON_CNTR_UNSET;
>   
>   	ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
>   	if (ret) {
> @@ -4128,6 +4142,10 @@ static void __init rdtgroup_setup_default(void)
>   	rdtgroup_default.closid = RESCTRL_RESERVED_CLOSID;
>   	rdtgroup_default.mon.rmid = RESCTRL_RESERVED_RMID;
>   	rdtgroup_default.type = RDTCTRL_GROUP;
> +
> +	rdtgroup_default.mon.cntr_id[0] = MON_CNTR_UNSET;
> +	rdtgroup_default.mon.cntr_id[1] = MON_CNTR_UNSET;
> +
>   	INIT_LIST_HEAD(&rdtgroup_default.mon.crdtgrp_list);
>   
>   	list_add(&rdtgroup_default.rdtgroup_list, &rdt_all_groups);

Reinette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ