lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <18ffa140-f7f9-450c-89d9-c833ff5763cb@amd.com>
Date: Tue, 16 Jul 2024 15:24:55 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
 fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
 tj@...nel.org, peterz@...radead.org, yanjiewtw@...il.com,
 kim.phillips@....com, lukas.bulwahn@...il.com, seanjc@...gle.com,
 jmattson@...gle.com, leitao@...ian.org, jpoimboe@...nel.org,
 rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
 jithu.joseph@...el.com, kai.huang@...el.com, kan.liang@...ux.intel.com,
 daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
 ilpo.jarvinen@...ux.intel.com, peternewman@...gle.com,
 maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org, eranian@...gle.com, james.morse@....com
Subject: Re: [PATCH v5 12/20] x86/resctrl: Add data structures and definitions
 for ABMC assignment

Hi Reinette,

On 7/12/24 17:13, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/3/24 2:48 PM, Babu Moger wrote:
>> The ABMC feature provides an option to the user to assign a hardware
>> counter to an RMID and monitor the bandwidth as long as the counter
>> is assigned. The bandwidth events will be tracked by the hardware until
>> the user changes the configuration. Each resctrl group can configure
>> maximum two counters, one for total event and one for local event.
>>
>> The counters are configured by writing to MSR L3_QOS_ABMC_CFG.
>> Configuration is done by setting the counter id, bandwidth source (RMID)
>> and bandwidth configuration supported by BMEC(Bandwidth Monitoring Event
>> Configuration). Reading L3_QOS_ABMC_DSC returns the configuration of the
>> counter id specified in L3_QOS_ABMC_CFG.
>>
>> Attempts to read or write these MSRs when ABMC is not enabled will result
>> in a #GP(0) exception.
>>
>> Introduce data structures and definitions for ABMC assignments.
>>
>> MSR L3_QOS_ABMC_CFG (0xC000_03FDh) and L3_QOS_ABMC_DSC (0xC000_03FEh)
>> details.
>> =========================================================================
>> Bits     Mnemonic    Description            Access Reset
>>                             Type   Value
>> =========================================================================
>> 63     CfgEn         Configuration Enable         R/W     0
>>
>> 62     CtrEn         Enable/disable Tracking        R/W     0
>>
>> 61:53     –         Reserved             MBZ     0
>>
>> 52:48     CtrID         Counter Identifier        R/W    0
>>
>> 47     IsCOS        BwSrc field is a CLOSID        R/W    0
>>             (not an RMID)
>>
>> 46:44     –        Reserved            MBZ    0
>>
>> 43:32    BwSrc        Bandwidth Source        R/W    0
>>             (RMID or CLOSID)
>>
>> 31:0    BwType        Bandwidth configuration        R/W    0
>>             to track for this counter
>> ==========================================================================
>>
>> The feature details are documented in the APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>> Monitoring (ABMC).
> 
> The changelog only describes the hardware interface yet the patch contains
> part hardware interface part new driver support for hardware interface.
> 

Yes. I may have to separate this.

>>
>> Signed-off-by: Babu Moger <babu.moger@....com>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> ---
>> v5: Moved assignment flags here (path 10/19 of v4).
>>      Added MON_CNTR_UNSET definition to initialize cntr_id's.
>>      More details in commit log.
>>      Renamed few fields in l3_qos_abmc_cfg for readability.
>>
>> v4: Added more descriptions.
>>      Changed the name abmc_ctr_id to ctr_id.
>>      Added L3_QOS_ABMC_DSC. Used for reading the configuration.
>>
>> v3: No changes.
>>
>> v2: No changes.
>> ---
>>   arch/x86/include/asm/msr-index.h       |  2 ++
>>   arch/x86/kernel/cpu/resctrl/internal.h | 40 ++++++++++++++++++++++++++
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 18 ++++++++++++
>>   3 files changed, 60 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/msr-index.h
>> b/arch/x86/include/asm/msr-index.h
>> index 263b2d9d00ed..5e44ff91f459 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -1175,6 +1175,8 @@
>>   #define MSR_IA32_SMBA_BW_BASE        0xc0000280
>>   #define MSR_IA32_EVT_CFG_BASE        0xc0000400
>>   #define MSR_IA32_L3_QOS_EXT_CFG        0xc00003ff
>> +#define MSR_IA32_L3_QOS_ABMC_CFG    0xc00003fd
>> +#define MSR_IA32_L3_QOS_ABMC_DSC    0xc00003fe
>>     /* MSR_IA32_VMX_MISC bits */
>>   #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h
>> b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 4cb1a5d014a3..6925c947682d 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -100,6 +100,18 @@ cpumask_any_housekeeping(const struct cpumask
>> *mask, int exclude_cpu)
>>   /* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature */
>>   #define ABMC_ENABLE            BIT(0)
>>   +/*
>> + * Assignment flags for ABMC feature
>> + */
>> +#define ASSIGN_NONE    0
>> +#define ASSIGN_TOTAL    BIT(QOS_L3_MBM_TOTAL_EVENT_ID)
>> +#define ASSIGN_LOCAL    BIT(QOS_L3_MBM_LOCAL_EVENT_ID)
> 
> These flags do not appear to be part of hardware interface and there
> is no explanation for what they mean or how they will be used. They are
> also not used in this patch. It is thus not possible to understand if
> they belong in this patch or is appropriate in this work.

ok. Will remove it from here. Will introduce later when it is used.

> 
>> +
>> +#define MON_CNTR_UNSET    U32_MAX
>> +
>> +/* Maximum assignable counters per resctrl group */
>> +#define MAX_CNTRS    2
>> +
>>   struct rdt_fs_context {
>>       struct kernfs_fs_context    kfc;
>>       bool                enable_cdpl2;
>> @@ -228,12 +240,14 @@ enum rdtgrp_mode {
>>    * @parent:            parent rdtgrp
>>    * @crdtgrp_list:        child rdtgroup node list
>>    * @rmid:            rmid for this rdtgroup
>> + * @cntr_id:            ABMC counter ids assigned to this group
> 
> struct mongroup is private to resctrl fs so it cannot contain an
> architecture specific feature. Having it contain a generic "cntr_id"
> may be ok at this point, but it should not be termed "ABMC counter".

Ok. Sure.

> 
>>    */
>>   struct mongroup {
>>       struct kernfs_node    *mon_data_kn;
>>       struct rdtgroup        *parent;
>>       struct list_head    crdtgrp_list;
>>       u32            rmid;
>> +    u32            cntr_id[MAX_CNTRS];
> 
> This is a significant addition yet is silently included as part of a patch
> that just introduces hardware interface. This is how resctrl will manage
> the hardware counters. It is significant since this is what dictates that it
> is resctrl fs that will manage the counters, which makes it important which
> interfaces are made available and from where it is called. Through
> this series I have also not come across a description of this architecture.
> With this introduction counters are maintained per monitor group, yet
> the new interface supports assigining counters per domain. There
> is no high level explanation of this architecture and the reader is forced
> to decipher it from the implementation making this work harder to review
> that necessary.
> 
> Would it be possible to present the fs and architecture code
> separately? I think doing so will make it easier to understand.

Sure. Will separate the two parts.

> 
>>   };
>>     /**
>> @@ -607,6 +621,32 @@ union cpuid_0x10_x_edx {
>>       unsigned int full;
>>   };
>>   +/*
>> + * ABMC counters can be configured by writing to L3_QOS_ABMC_CFG.
>> + * @bw_type        : Bandwidth configuration(supported by BMEC)
>> + *              to track this counter id.
> 
> Does "to track this counter id" mean "tracked by @cntr_id"?

Yea. Sure.

> 
>> + * @bw_src        : Bandwidth Source (RMID or CLOSID).
> 
> Please do not capitalize words mid sentence, like "Source"
> above, "Identifier", and "Enable" in two instances below.
> 
>> + * @reserved1        : Reserved.
>> + * @is_clos        : BwSrc field is a CLOSID (not an RMID).
> 
> Just stick to @bw_src.

Sure.

> 
>> + * @cntr_id        : Counter Identifier.
>> + * @reserved        : Reserved.
>> + * @cntr_en        : Tracking Enable bit.
> 
> Can this be more detailed about what happens when this bit is set/clear?

Sure. Will add it.
cfn_en = 1,  cntr_en= 0;
  Counter will be be configured and tracking is not enabled.

cfn_en = 1,  cntr_en= 1;
  Counter will be be configured and tracking will be enabled.


> 
>> + * @cfg_en        : Configuration Enable bit.
> 
> What is difference between "configuration enable" and "tracking enable"?
> What is relationship, if any, to @bw_type that is the bandwidth
> configuration?
> 
>> + */
>> +union l3_qos_abmc_cfg {
>> +    struct {
>> +        unsigned long    bw_type    :32,
>> +                bw_src    :12,
>> +                reserved1: 3,
>> +                is_clos    : 1,
>> +                cntr_id    : 5,
>> +                reserved : 9,
>> +                cntr_en    : 1,
>> +                cfg_en    : 1;
>> +    } split;
> 
> Please check the spacing in this data structure. Tabs are used inconsistently
> and the members are not lining up either.

Sure.

> 
>> +    unsigned long full;
>> +};
>> +
>>   void rdt_last_cmd_clear(void);
>>   void rdt_last_cmd_puts(const char *s);
>>   __printf(1, 2)
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 91c5d45ac367..d2663f1345b7 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2505,6 +2505,7 @@ static void resctrl_abmc_set_one_amd(void *arg)
>>     static int _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
>>   {
>> +    struct rdtgroup *prgrp, *crgrp;
>>       struct rdt_mon_domain *d;
>>         /*
>> @@ -2513,6 +2514,17 @@ static int _resctrl_abmc_enable(struct
>> rdt_resource *r, bool enable)
>>        */
>>       mbm_cntrs_init();
>>   +    /* Reset the cntr_id's for all the monitor groups */
>> +    list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
>> +        prgrp->mon.cntr_id[0] = MON_CNTR_UNSET;
>> +        prgrp->mon.cntr_id[1] = MON_CNTR_UNSET;
>> +        list_for_each_entry(crgrp, &prgrp->mon.crdtgrp_list,
>> +                    mon.crdtgrp_list) {
>> +            crgrp->mon.cntr_id[0] = MON_CNTR_UNSET;
>> +            crgrp->mon.cntr_id[1] = MON_CNTR_UNSET;
>> +        }
>> +    }
>> +
> 
> No. The counters are in the monitor group that is a structure that is private
> to the fs. The architecture code should not be accessing it. This should
> only be
> done by fs code.

Will move this code to FS part before coming here.

> 
>>       /*
>>        * Hardware counters will reset after switching the monitor mode.
>>        * Reset the architectural state so that reading of hardware
>> @@ -3573,6 +3585,8 @@ static int mkdir_rdt_prepare_rmid_alloc(struct
>> rdtgroup *rdtgrp)
>>           return ret;
>>       }
>>       rdtgrp->mon.rmid = ret;
>> +    rdtgrp->mon.cntr_id[0] = MON_CNTR_UNSET;
>> +    rdtgrp->mon.cntr_id[1] = MON_CNTR_UNSET;
>>         ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp,
>> &rdtgrp->mon.mon_data_kn);
>>       if (ret) {
>> @@ -4128,6 +4142,10 @@ static void __init rdtgroup_setup_default(void)
>>       rdtgroup_default.closid = RESCTRL_RESERVED_CLOSID;
>>       rdtgroup_default.mon.rmid = RESCTRL_RESERVED_RMID;
>>       rdtgroup_default.type = RDTCTRL_GROUP;
>> +
>> +    rdtgroup_default.mon.cntr_id[0] = MON_CNTR_UNSET;
>> +    rdtgroup_default.mon.cntr_id[1] = MON_CNTR_UNSET;
>> +
>>       INIT_LIST_HEAD(&rdtgroup_default.mon.crdtgrp_list);
>>         list_add(&rdtgroup_default.rdtgroup_list, &rdt_all_groups);
> 
> Reinette
> 

-- 
Thanks
Babu Moger

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ