lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d9571277-5d1b-39f2-696b-27000cd8fd3e@amd.com>
Date: Thu, 21 Nov 2024 18:26:21 -0600
From: "Moger, Babu" <bmoger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>,
 Babu Moger <babu.moger@....com>, corbet@....net, tglx@...utronix.de,
 mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com
Cc: fenghua.yu@...el.com, x86@...nel.org, hpa@...or.com, thuth@...hat.com,
 paulmck@...nel.org, rostedt@...dmis.org, akpm@...ux-foundation.org,
 xiongwei.song@...driver.com, pawan.kumar.gupta@...ux.intel.com,
 daniel.sneddon@...ux.intel.com, perry.yuan@....com, sandipan.das@....com,
 kai.huang@...el.com, xiaoyao.li@...el.com, seanjc@...gle.com,
 jithu.joseph@...el.com, brijesh.singh@....com, xin3.li@...el.com,
 ebiggers@...gle.com, andrew.cooper3@...rix.com, mario.limonciello@....com,
 james.morse@....com, tan.shaopeng@...itsu.com, tony.luck@...el.com,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
 peternewman@...gle.com, maciej.wieczor-retman@...el.com, eranian@...gle.com,
 jpoimboe@...nel.org, thomas.lendacky@....com
Subject: Re: [PATCH v9 20/26] x86/resctrl: Auto assign/unassign counters when
 mbm_cntr_assign is enabled



On 11/21/2024 6:22 PM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 11/18/2024 11:18 AM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 10/29/24 4:21 PM, Babu Moger wrote:
>>> Assign/unassign counters on resctrl group creation/deletion. Two 
>>> counters
>>> are required per group, one for MBM total event and one for MBM local
>>> event.
>>>
>>> There are a limited number of counters available for assignment. If 
>>> these
>>> counters are exhausted, the kernel will display the error message: 
>>> "Out of
>>> MBM assignable counters". However, it is not necessary to fail the
>>> creation of a group due to assignment failures. Users have the 
>>> flexibility
>>> to modify the assignments at a later time.
>>>
>>> Signed-off-by: Babu Moger <babu.moger@....com>
>>> ---
>>> v9: Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to 
>>> return void.
>>>      Updated couple of rdtgroup_unassign_cntrs() calls properly.
>>>      Updated function comments.
>>>
>>> v8: Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
>>>      Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
>>>      Fixed the problem with unassigning the child MON groups of 
>>> CTRL_MON group.
>>>
>>> v7: Reworded the commit message.
>>>      Removed the reference of ABMC with mbm_cntr_assign.
>>>      Renamed the function rdtgroup_assign_cntrs to rdtgroup_assign_grp.
>>>
>>> v6: Removed the redundant comments on all the calls of
>>>      rdtgroup_assign_cntrs. Updated the commit message.
>>>      Dropped printing error message on every call of 
>>> rdtgroup_assign_cntrs.
>>>
>>> v5: Removed the code to enable/disable ABMC during the mount.
>>>      That will be another patch.
>>>      Added arch callers to get the arch specific data.
>>>      Renamed fuctions to match the other abmc function.
>>>      Added code comments for assignment failures.
>>>
>>> v4: Few name changes based on the upstream discussion.
>>>      Commit message update.
>>>
>>> v3: This is a new patch. Patch addresses the upstream comment to enable
>>>      ABMC feature by default if the feature is available.
>>> ---
>>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 61 +++++++++++++++++++++++++-
>>>   1 file changed, 60 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c 
>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index b0cce3dfd062..a8d21b0b2054 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -2932,6 +2932,46 @@ static void schemata_list_destroy(void)
>>>       }
>>>   }
>>> +/*
>>> + * Called when a new group is created. If "mbm_cntr_assign" mode is 
>>> enabled,
>>> + * counters are automatically assigned. Each group can accommodate 
>>> two counters:
>>> + * one for the total event and one for the local event. Assignments 
>>> may fail
>>> + * due to the limited number of counters. However, it is not 
>>> necessary to fail
>>> + * the group creation and thus no failure is returned. Users have 
>>> the option
>>> + * to modify the counter assignments after the group has been created.
>>> + */
>>> +static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
>>> +{
>>> +    struct rdt_resource *r = 
>>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>> +
>>> +    if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>>> +        return;
>>> +
>>> +    if (is_mbm_total_enabled())
>>> +        rdtgroup_assign_cntr_event(r, rdtgrp, NULL, 
>>> QOS_L3_MBM_TOTAL_EVENT_ID);
>>> +
>>> +    if (is_mbm_local_enabled())
>>> +        rdtgroup_assign_cntr_event(r, rdtgrp, NULL, 
>>> QOS_L3_MBM_LOCAL_EVENT_ID);
>>> +}
>>> +
>>> +/*
>>> + * Called when a group is deleted. Counters are unassigned if it was in
>>> + * assigned state.
>>> + */
>>> +static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
>>> +{
>>> +    struct rdt_resource *r = 
>>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>> +
>>> +    if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>>> +        return;
>>> +
>>> +    if (is_mbm_total_enabled())
>>> +        rdtgroup_unassign_cntr_event(r, rdtgrp, NULL, 
>>> QOS_L3_MBM_TOTAL_EVENT_ID);
>>> +
>>> +    if (is_mbm_local_enabled())
>>> +        rdtgroup_unassign_cntr_event(r, rdtgrp, NULL, 
>>> QOS_L3_MBM_LOCAL_EVENT_ID);
>>> +}
>>> +
>>>   static int rdt_get_tree(struct fs_context *fc)
>>>   {
>>>       struct rdt_fs_context *ctx = rdt_fc2context(fc);
>>> @@ -2991,6 +3031,8 @@ static int rdt_get_tree(struct fs_context *fc)
>>>           if (ret < 0)
>>>               goto out_mongrp;
>>>           rdtgroup_default.mon.mon_data_kn = kn_mondata;
>>> +
>>> +        rdtgroup_assign_cntrs(&rdtgroup_default);
>>
>> I think counters should be assigned *before* the files exposing them
>> are added to resctrl.
> 
> Sure.
> 
>>
>>>       }
>>>       ret = rdt_pseudo_lock_init();
>>> @@ -3021,8 +3063,10 @@ static int rdt_get_tree(struct fs_context *fc)
>>>   out_psl:
>>>       rdt_pseudo_lock_release();
>>>   out_mondata:
>>> -    if (resctrl_arch_mon_capable())
>>> +    if (resctrl_arch_mon_capable()) {
>>> +        rdtgroup_unassign_cntrs(&rdtgroup_default);
>>>           kernfs_remove(kn_mondata);
>>
>> ... and here remove the files before taking away the data exposed by 
>> them.
> 
> Sure.
> 
>>
>>> +    }
>>>   out_mongrp:
>>>       if (resctrl_arch_mon_capable())
>>>           kernfs_remove(kn_mongrp);
>>> @@ -3201,6 +3245,7 @@ static void free_all_child_rdtgrp(struct 
>>> rdtgroup *rdtgrp)
>>>       head = &rdtgrp->mon.crdtgrp_list;
>>>       list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
>>> +        rdtgroup_unassign_cntrs(sentry);
>>>           free_rmid(sentry->closid, sentry->mon.rmid);
>>>           list_del(&sentry->mon.crdtgrp_list);
>>> @@ -3241,6 +3286,8 @@ static void rmdir_all_sub(void)
>>>           cpumask_or(&rdtgroup_default.cpu_mask,
>>>                  &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
>>> +        rdtgroup_unassign_cntrs(rdtgrp);
>>> +
>>>           free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>>           kernfs_remove(rdtgrp->kn);
>>> @@ -3272,6 +3319,7 @@ static void rdt_kill_sb(struct super_block *sb)
>>>       for_each_alloc_capable_rdt_resource(r)
>>>           reset_all_ctrls(r);
>>>       rmdir_all_sub();
>>> +    rdtgroup_unassign_cntrs(&rdtgroup_default);
>>>       rdt_pseudo_lock_release();
>>>       rdtgroup_default.mode = RDT_MODE_SHAREABLE;
>>>       schemata_list_destroy();
>>> @@ -3280,6 +3328,7 @@ static void rdt_kill_sb(struct super_block *sb)
>>>           resctrl_arch_disable_alloc();
>>>       if (resctrl_arch_mon_capable())
>>>           resctrl_arch_disable_mon();
>>> +
>>>       resctrl_mounted = false;
>>>       kernfs_kill_sb(sb);
>>>       mutex_unlock(&rdtgroup_mutex);
>>
>> Unnecessary hunk.
> 
> ok
> 
>>
>>> @@ -3871,6 +3920,8 @@ static int rdtgroup_mkdir_mon(struct 
>>> kernfs_node *parent_kn,
>>>           goto out_unlock;
>>>       }
>>> +    rdtgroup_assign_cntrs(rdtgrp);
>>> +
>>>       kernfs_activate(rdtgrp->kn);
>>>       /*
>>> @@ -3915,6 +3966,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct 
>>> kernfs_node *parent_kn,
>>>       if (ret)
>>>           goto out_closid_free;
>>> +    rdtgroup_assign_cntrs(rdtgrp);
>>> +
>>>       kernfs_activate(rdtgrp->kn);
>>>       ret = rdtgroup_init_alloc(rdtgrp);
>>
>> Please compare the above two hunks with earlier "x86/resctrl: 
>> Introduce cntr_id in mongroup for assignments".
>> Earlier patch initializes the counters within 
>> mkdir_rdt_prepare_rmid_alloc() while the above
>> hunk assigns the counters after mkdir_rdt_prepare_rmid_alloc() is 
>> called. Could this fragmentation be avoided
>> with init done once within mkdir_rdt_prepare_rmid_alloc()?
> 
> It seems more appropriate to call rdtgroup_cntr_id_init() inside 
> mkdir_rdt_prepare(). This will initialize the cntr_id to MON_CNTR_UNSET.
> 
> And then call rdtgroup_assign_cntrs() inside 
> mkdir_rdt_prepare_rmid_alloc().
> 
> what do you think?
> 
> 
>>
>>> @@ -3940,6 +3993,7 @@ static int rdtgroup_mkdir_ctrl_mon(struct 
>>> kernfs_node *parent_kn,
>>>   out_del_list:
>>>       list_del(&rdtgrp->rdtgroup_list);
>>>   out_rmid_free:
>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>>       mkdir_rdt_prepare_rmid_free(rdtgrp);
>>>   out_closid_free:
>>>       closid_free(closid);
>>> @@ -4010,6 +4064,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup 
>>> *rdtgrp, cpumask_var_t tmpmask)
>>>       update_closid_rmid(tmpmask, NULL);
>>>       rdtgrp->flags = RDT_DELETED;
>>> +
>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>> +
>>>       free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>>       /*
>>> @@ -4056,6 +4113,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup 
>>> *rdtgrp, cpumask_var_t tmpmask)
>>>       cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
>>>       update_closid_rmid(tmpmask, NULL);
>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>> +
>>>       free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>>       closid_free(rdtgrp->closid);
>>
>> There is a potential problem here. rdtgroup_unassign_cntrs() attempts 
>> to remove counter from
>> all domains associated with the resource group. This may fail in any 
>> of the domains that results
>> in the counter not being marked as free in the global map and not 
>> reset the counter in the
>> resource group ... but the resource group is removed right after 
>> calling rdtgroup_unassign_cntrs().
>> In this scenario there is thus a counter that is considered to be in 
>> use but not assigned to any
>> resource group.
>>
>>>> From what I can tell there is a difference here between default 
>>>> resource group and the others:
>> on remount of resctrl the default resource group will maintain 
>> knowledge of the counter that could
>> not be unassigned. This means that unmount/remount of resctrl does not 
>> provide a real "clean slate"
>> when it comes to counter assignment. Is this intended?
>>
> 
> Yes. Agree. It is not intended.
> 
> How about adding bitmap_zero() inside rdt_get_tree() to address this 
> problem? Also need to reset the cntr_id in rdt_kill_sb().

I meant reset the cntr_id for the default group in rdt_kill_sb().
-- 
- Babu Moger

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ