lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f36f5459-cf5a-4401-8a00-cca83c293fdc@amd.com>
Date: Tue, 20 Aug 2024 15:18:15 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
 fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
 tj@...nel.org, peterz@...radead.org, yanjiewtw@...il.com,
 kim.phillips@....com, lukas.bulwahn@...il.com, seanjc@...gle.com,
 jmattson@...gle.com, leitao@...ian.org, jpoimboe@...nel.org,
 rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
 jithu.joseph@...el.com, kai.huang@...el.com, kan.liang@...ux.intel.com,
 daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
 ilpo.jarvinen@...ux.intel.com, peternewman@...gle.com,
 maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org, eranian@...gle.com, james.morse@....com
Subject: Re: [PATCH v6 20/22] x86/resctrl: Enable AMD ABMC feature by default
 when supported



On 8/20/24 15:04, Moger, Babu wrote:
> Hi Reinette,
> 
> On 8/20/24 13:12, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 8/19/24 11:18 AM, Moger, Babu wrote:
>>> On 8/16/24 17:33, Reinette Chatre wrote:
>>>> On 8/6/24 3:00 PM, Babu Moger wrote:
>>
>>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> index 66febff2a3d3..d15fd1bde5f4 100644
>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> @@ -2756,6 +2756,23 @@ void resctrl_arch_mbm_cntr_assign_disable(void)
>>>>>        }
>>>>>    }
>>>>>    +void resctrl_arch_mbm_cntr_assign_configure(void)
>>>>> +{
>>>>> +    struct rdt_resource *r =
>>>>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>>>> +    struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>>>> +    bool enable = true;
>>>>> +
>>>>> +    mutex_lock(&rdtgroup_mutex);
>>>>> +
>>>>> +    if (r->mon.mbm_cntr_assignable) {
>>>>> +        if (!hw_res->mbm_cntr_assign_enabled)
>>>>> +            hw_res->mbm_cntr_assign_enabled = true;
>>>>> +        resctrl_abmc_set_one_amd(&enable);
>>>>
>>>> Earlier changelogs mentioned that counters are reset when ABMC is enabled.
>>>> How does that behave here when one CPU comes online? Consider the scenario
>>>> where
>>>> a system is booted without all CPUs online. ABMC is initially enabled on
>>>> all online
>>>> CPUs with this flow ... user space could start using resctrl fs and create
>>>> monitor groups that start accumulating architectural state. If the
>>>> remaining
>>>> CPUs come online at this point and this snippet enables ABMC, would it
>>>> reset
>>>> all counters? Should the architectural state be cleared?
>>>
>>> When new cpu comes online, it should inherit the abmc state which is set
>>> already. it should not force it either way. In that case, it is not
>>> required to reset the architectural state.
>>>
>>> Responded to your earlier comment.
>>> https://lore.kernel.org/lkml/0256b457-175d-4923-aa49-00e8e52b865b@amd.com/
>>>
>>>
>>>>
>>>> Also, it still does not look right that the architecture decides the
>>>> policy.
>>>> Could this enabling be moved to resctrl_online_cpu() for resctrl fs to
>>>> request architecture to enable assignable counters if it is supported?
>>>
>>> Sure. Will move the resctrl_arch_mbm_cntr_assign_configure() here with
>>> changes just to update the abmc state which is set during the init.
>>>
>>
>> I do not think we are seeing it the same way. In your earlier comment you
>> mention:
>>
>>> We need to set abmc state to "enabled" during the init when abmc is
>>> detected.  resctrl_late_init -> .. -> rdt_get_mon_l3_config
>>>
>>> This only happens once during the init.
>>
>>
>> I do not think that the ABMC state can be set during init since that runs
>> before the fs code and thus the arch code cannot be aware of the fs policy
>> that "mbm_assign_mode" is the default. This may become clear when you move
>> resctrl_arch_mbm_cntr_assign_configure() to resctrl_online_cpu() though
>> since I expect that the r->mon.mbm_cntr_assignable check will move
>> into the fs resctrl_online_cpu() that will call the arch helper to
>> set the state to enabled.
> 
> There are couple of problems here.
> 
> 1. Hotplug with ABMC enabled.
> 
>   System is running with ABMC enabled. Now, new cpu cames online.
>   The function resctrl_arch_mbm_cntr_assign_configure() will set the MSR
> MSR_IA32_L3_QOS_EXT_CFG to enable ABMC on the new CPU. This scenario works
> fine.
> 
> 
> 2. Hotplug with ABMC disabled.
>   Current code will force the system to enable ABMC on the new CPU.
>   That is not correct.
> 
> 
> We need to address both these cases.
> 
> 
> I was thinking of separating the functionality in
> resctrl_arch_mbm_cntr_assign_configure() into two.
> 
> a. Just set the mbm_cntr_assign_enabled to true during the init.
>    if (r->mon.mbm_cntr_assignable)
>        hw_res->mbm_cntr_assign_enabled = true;
> 
>    This is similar to rdtgroup_setup_default().  Isn't it?

I just noticed that I cannot access rdt_hw_resource here. I may have to
write another function resctrl_arch_mbm_cntr_assign_set_default() to do
this. What do you think?


> 
> 
> b. Change the functionality in resctrl_arch_mbm_cntr_assign_configure()
>    to update the MSR MSR_IA32_L3_QOS_EXT_CFG based on
> hw_res->mbm_cntr_assign_enabled.  Something like this.
> 
> 
>    void resctrl_arch_mbm_cntr_assign_configure(void)
> {
>     ---
>     if (r->mon.mbm_cntr_assignable &&  hw_res->mbm_cntr_assign_enabled)
>             abmc_set_one_amd(&enable);
>    ---
> }
> 
> 
> Yes.  The function resctrl_arch_mbm_cntr_assign_configure() will be called
> from resctrl_online_cpu().
> 
> Does it make sense?  Any other idea?

-- 
Thanks
Babu Moger

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ