lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <225f198d-092d-437d-9c5b-015517201218@amd.com>
Date:   Thu, 19 Nov 2020 16:44:27 -0600
From:   Babu Moger <babu.moger@....com>
To:     Reinette Chatre <reinette.chatre@...el.com>, bp@...en8.de
Cc:     fenghua.yu@...el.com, x86@...nel.org, linux-kernel@...r.kernel.org,
        mingo@...hat.com, hpa@...or.com, tglx@...utronix.de
Subject: Re: [PATCH] x86/resctrl: Fix AMD L3 QOS CDP enable/disable

Hi Reinette,


On 11/18/20 4:18 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 11/6/2020 12:14 PM, Babu Moger wrote:
>> When the AMD QoS feature CDP(code and data prioritization) is enabled
>> or disabled, the CDP bit in MSR 0000_0C81 is written on one of the
>> cpus in L3 domain(core complex). That is not correct. The CDP bit needs
>> to be updated all the logical cpus in the domain.
> 
> Could you please use CPU instead of cpu throughout, in commit message as
> well as the new code comments?

Sure. Will do.

> 
>>
>> This was not spelled out clearly in the spec earlier. The specification
>> has been updated. The updated specification, "AMD64 Technology Platform
>> Quality of Service Extensions Publication # 56375 Revision: 1.02 Issue
>> Date: October 2020" is available now. Refer the section: Code and Data
>> Prioritization.
>>
>> Fix the issue by adding a new flag arch_needs_update_all in rdt_cache
>> data structure.
> 
> I understand that naming is hard and could be a sticky point. Even so, I
> am concerned that this name is too generic. For example, there are other
> cache settings that are successfully set on a single CPU in the L3 domain
> (the bitmasks for example). This new name and its description in the code
> comments below does not make it clear which cache settings it applies to.
> 
> I interpret this change to mean that the L[23]_QOS_CFG MSR has CPU scope
> while the other L3 QoS configuration registers have the same scope as the
> L3 cache. Could this new variable thus perhaps be named
> "arch_has_per_cpu_cfg"? I considered "arch_has_per_cpu_cdp" but when a new
> field is added to that register it may cause confusion.

Sounds good. Will change it to arch_has_per_cpu_cfg.
> 
>> The documentation can be obtained at the links below:
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdeveloper.amd.com%2Fwp-content%2Fresources%2F56375.pdf&amp;data=04%7C01%7Cbabu.moger%40amd.com%7C3b793291233443b27f6d08d88c0fdc9f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637413347449195250%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=%2B4AUG%2FbAeq70s0XuZ9J%2FOTTEFO8EypLvcR6yBuWE8U4%3D&amp;reserved=0
>>
>> Link:
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D206537&amp;data=04%7C01%7Cbabu.moger%40amd.com%7C3b793291233443b27f6d08d88c0fdc9f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637413347449195250%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=4lzj2ivqxvFaLC99TOIJEGU3p6CmlBLCPHT80LlKsNE%3D&amp;reserved=0
>>
>>
>> Fixes: 4d05bf71f157 ("x86/resctrl: Introduce AMD QOS feature")
>>
>> Signed-off-by: Babu Moger <babu.moger@....com>
>> ---
>>   arch/x86/kernel/cpu/resctrl/core.c     |    3 +++
>>   arch/x86/kernel/cpu/resctrl/internal.h |    3 +++
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c |    9 +++++++--
>>   3 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c
>> b/arch/x86/kernel/cpu/resctrl/core.c
>> index e5f4ee8f4c3b..142c92a12254 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -570,6 +570,8 @@ static void domain_add_cpu(int cpu, struct
>> rdt_resource *r)
>>         if (d) {
>>           cpumask_set_cpu(cpu, &d->cpu_mask);
>> +        if (r->cache.arch_needs_update_all)
>> +            rdt_domain_reconfigure_cdp(r);
>>           return;
>>       }
>>   @@ -943,6 +945,7 @@ static __init void rdt_init_res_defs_amd(void)
>>               r->rid == RDT_RESOURCE_L2CODE) {
>>               r->cache.arch_has_sparse_bitmaps = true;
>>               r->cache.arch_has_empty_bitmaps = true;
>> +            r->cache.arch_needs_update_all = true;
>>           } else if (r->rid == RDT_RESOURCE_MBA) {
>>               r->msr_base = MSR_IA32_MBA_BW_BASE;
>>               r->msr_update = mba_wrmsr_amd;
> 
> The current pattern is to set these flags on all the architectures. Could
> you thus please set the flag within rdt_init_defs_intel()? I confirmed
> that the scope is the same as the cache domain in Intel RDT so the flag
> should be false.

Yes, Will add that.

> 
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h
>> b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 80fa997fae60..d23262d59a51 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -360,6 +360,8 @@ struct msr_param {
>>    *            executing entities
>>    * @arch_has_sparse_bitmaps:    True if a bitmap like f00f is valid.
>>    * @arch_has_empty_bitmaps:    True if the '0' bitmap is valid.
>> + * @arch_needs_update_all:    True if arch needs to update the cache
>> + *                settings on all the cpus in the domain.
> 
> Please do update this to make it clear what "cache settings" are referred
> to. Since this is in struct rdt_cache perhaps something like "QOS_CFG
> register for this cache level has CPU scope."

Sure.
> 
>>    */
>>   struct rdt_cache {
>>       unsigned int    cbm_len;
>> @@ -369,6 +371,7 @@ struct rdt_cache {
>>       unsigned int    shareable_bits;
>>       bool        arch_has_sparse_bitmaps;
>>       bool        arch_has_empty_bitmaps;
>> +    bool        arch_needs_update_all;
>>   };
>>     /**
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index af323e2e3100..a005e90b373a 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1905,8 +1905,13 @@ static int set_cache_qos_cfg(int level, bool enable)
>>         r_l = &rdt_resources_all[level];
>>       list_for_each_entry(d, &r_l->domains, list) {
>> -        /* Pick one CPU from each domain instance to update MSR */
>> -        cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
>> +        if (r_l->cache.arch_needs_update_all)
>> +            /* Pick all the cpus in the domain instance */
>> +            for_each_cpu(cpu, &d->cpu_mask)
>> +                cpumask_set_cpu(cpu, cpu_mask);
>> +        else
>> +            /* Pick one CPU from each domain instance to update MSR */
>> +            cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
>>       }
>>       cpu = get_cpu();
>>       /* Update QOS_CFG MSR on this cpu if it's in cpu_mask. */
>>
> 
> The solution looks good to me, thank you very much.

Thanks for the review.
-Babu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ