lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b80a9e76-764a-4303-ada8-aa4d08559664@amd.com>
Date: Tue, 20 Aug 2024 15:04:31 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
 fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
 tj@...nel.org, peterz@...radead.org, yanjiewtw@...il.com,
 kim.phillips@....com, lukas.bulwahn@...il.com, seanjc@...gle.com,
 jmattson@...gle.com, leitao@...ian.org, jpoimboe@...nel.org,
 rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
 jithu.joseph@...el.com, kai.huang@...el.com, kan.liang@...ux.intel.com,
 daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
 ilpo.jarvinen@...ux.intel.com, peternewman@...gle.com,
 maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org, eranian@...gle.com, james.morse@....com
Subject: Re: [PATCH v6 20/22] x86/resctrl: Enable AMD ABMC feature by default
 when supported

Hi Reinette,

On 8/20/24 13:12, Reinette Chatre wrote:
> Hi Babu,
> 
> On 8/19/24 11:18 AM, Moger, Babu wrote:
>> On 8/16/24 17:33, Reinette Chatre wrote:
>>> On 8/6/24 3:00 PM, Babu Moger wrote:
> 
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> index 66febff2a3d3..d15fd1bde5f4 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> @@ -2756,6 +2756,23 @@ void resctrl_arch_mbm_cntr_assign_disable(void)
>>>>        }
>>>>    }
>>>>    +void resctrl_arch_mbm_cntr_assign_configure(void)
>>>> +{
>>>> +    struct rdt_resource *r =
>>>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>>> +    struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>>> +    bool enable = true;
>>>> +
>>>> +    mutex_lock(&rdtgroup_mutex);
>>>> +
>>>> +    if (r->mon.mbm_cntr_assignable) {
>>>> +        if (!hw_res->mbm_cntr_assign_enabled)
>>>> +            hw_res->mbm_cntr_assign_enabled = true;
>>>> +        resctrl_abmc_set_one_amd(&enable);
>>>
>>> Earlier changelogs mentioned that counters are reset when ABMC is enabled.
>>> How does that behave here when one CPU comes online? Consider the scenario
>>> where
>>> a system is booted without all CPUs online. ABMC is initially enabled on
>>> all online
>>> CPUs with this flow ... user space could start using resctrl fs and create
>>> monitor groups that start accumulating architectural state. If the
>>> remaining
>>> CPUs come online at this point and this snippet enables ABMC, would it
>>> reset
>>> all counters? Should the architectural state be cleared?
>>
>> When new cpu comes online, it should inherit the abmc state which is set
>> already. it should not force it either way. In that case, it is not
>> required to reset the architectural state.
>>
>> Responded to your earlier comment.
>> https://lore.kernel.org/lkml/0256b457-175d-4923-aa49-00e8e52b865b@amd.com/
>>
>>
>>>
>>> Also, it still does not look right that the architecture decides the
>>> policy.
>>> Could this enabling be moved to resctrl_online_cpu() for resctrl fs to
>>> request architecture to enable assignable counters if it is supported?
>>
>> Sure. Will move the resctrl_arch_mbm_cntr_assign_configure() here with
>> changes just to update the abmc state which is set during the init.
>>
> 
> I do not think we are seeing it the same way. In your earlier comment you
> mention:
> 
>> We need to set abmc state to "enabled" during the init when abmc is
>> detected.  resctrl_late_init -> .. -> rdt_get_mon_l3_config
>>
>> This only happens once during the init.
> 
> 
> I do not think that the ABMC state can be set during init since that runs
> before the fs code and thus the arch code cannot be aware of the fs policy
> that "mbm_assign_mode" is the default. This may become clear when you move
> resctrl_arch_mbm_cntr_assign_configure() to resctrl_online_cpu() though
> since I expect that the r->mon.mbm_cntr_assignable check will move
> into the fs resctrl_online_cpu() that will call the arch helper to
> set the state to enabled.

There are couple of problems here.

1. Hotplug with ABMC enabled.

  System is running with ABMC enabled. Now, new cpu cames online.
  The function resctrl_arch_mbm_cntr_assign_configure() will set the MSR
MSR_IA32_L3_QOS_EXT_CFG to enable ABMC on the new CPU. This scenario works
fine.


2. Hotplug with ABMC disabled.
  Current code will force the system to enable ABMC on the new CPU.
  That is not correct.


We need to address both these cases.


I was thinking of separating the functionality in
resctrl_arch_mbm_cntr_assign_configure() into two.

a. Just set the mbm_cntr_assign_enabled to true during the init.
   if (r->mon.mbm_cntr_assignable)
       hw_res->mbm_cntr_assign_enabled = true;

   This is similar to rdtgroup_setup_default().  Isn't it?


b. Change the functionality in resctrl_arch_mbm_cntr_assign_configure()
   to update the MSR MSR_IA32_L3_QOS_EXT_CFG based on
hw_res->mbm_cntr_assign_enabled.  Something like this.


   void resctrl_arch_mbm_cntr_assign_configure(void)
{
    ---
    if (r->mon.mbm_cntr_assignable &&  hw_res->mbm_cntr_assign_enabled)
            abmc_set_one_amd(&enable);
   ---
}


Yes.  The function resctrl_arch_mbm_cntr_assign_configure() will be called
from resctrl_online_cpu().

Does it make sense?  Any other idea?
-- 
Thanks
Babu Moger

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ