Message-ID: <007b8995-8865-40e9-ad74-33b5562d5dcf@amd.com>
Date: Sat, 21 Dec 2024 07:45:09 -0600
From: "Moger, Babu" <bmoger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>,
Babu Moger <babu.moger@....com>, corbet@....net, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
tony.luck@...el.com, peternewman@...gle.com
Cc: fenghua.yu@...el.com, x86@...nel.org, hpa@...or.com, paulmck@...nel.org,
akpm@...ux-foundation.org, thuth@...hat.com, rostedt@...dmis.org,
xiongwei.song@...driver.com, pawan.kumar.gupta@...ux.intel.com,
daniel.sneddon@...ux.intel.com, jpoimboe@...nel.org, perry.yuan@....com,
sandipan.das@....com, kai.huang@...el.com, xiaoyao.li@...el.com,
seanjc@...gle.com, xin3.li@...el.com, andrew.cooper3@...rix.com,
ebiggers@...gle.com, mario.limonciello@....com, james.morse@....com,
tan.shaopeng@...itsu.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, maciej.wieczor-retman@...el.com,
eranian@...gle.com
Subject: Re: [PATCH v10 18/24] x86/resctrl: Auto assign/unassign counters when
mbm_cntr_assign is enabled
Hi Reinette,
On 12/19/2024 5:39 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 12/12/24 12:15 PM, Babu Moger wrote:
>> Assign/unassign counters on resctrl group creation/deletion. Two counters
>> are required per group, one for MBM total event and one for MBM local
>> event.
>>
>> There are a limited number of counters available for assignment. If these
>> counters are exhausted, the kernel will display the error message: "Out of
>> MBM assignable counters". However, it is not necessary to fail the
>> creation of a group due to assignment failures. Users have the flexibility
>> to modify the assignments at a later time.
>>
>> Signed-off-by: Babu Moger <babu.moger@....com>
>> ---
>> ---
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 81 +++++++++++++++++++++++++-
>> 1 file changed, 79 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index a71a8389b649..5acae525881a 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -920,6 +920,25 @@ static int rdtgroup_available_mbm_cntrs_show(struct kernfs_open_file *of,
>> return ret;
>> }
>>
>> +static void mbm_cntr_reset(struct rdt_resource *r)
>> +{
>> + struct rdt_mon_domain *dom;
>> +
>> + /*
>> + * Hardware counters will reset after switching the monitor mode.
>> + * Reset the architectural state so that reading of hardware
>> + * counter is not considered as an overflow in the next update.
>> + * Also reset the domain counter bitmap.
>> + */
>> + if (is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
>> + list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>> + memset(dom->cntr_cfg, 0,
>> + sizeof(*dom->cntr_cfg) * r->mon.num_mbm_cntrs);
>> + resctrl_arch_reset_rmid_all(r, dom);
>
> This looks to be missing reset of resctrl monitor state (from get_mbm_state()).
Yes. Will do.
>
> ...
>
>> static int rdt_get_tree(struct fs_context *fc)
>> {
>> struct rdt_fs_context *ctx = rdt_fc2context(fc);
>> @@ -3023,6 +3082,8 @@ static int rdt_get_tree(struct fs_context *fc)
>> if (ret < 0)
>> goto out_info;
>>
>> + rdtgroup_assign_cntrs(&rdtgroup_default);
>> +
>> ret = mkdir_mondata_all(rdtgroup_default.kn,
>> &rdtgroup_default, &kn_mondata);
>> if (ret < 0)
>
> If this mkdir_mondata_all() fails it calls "goto out_mongrp" ...
Sure.
>
>> @@ -3058,8 +3119,10 @@ static int rdt_get_tree(struct fs_context *fc)
>> out_psl:
>> rdt_pseudo_lock_release();
>> out_mondata:
>> - if (resctrl_arch_mon_capable())
>> + if (resctrl_arch_mon_capable()) {
>> kernfs_remove(kn_mondata);
>> + rdtgroup_unassign_cntrs(&rdtgroup_default);
>> + }
>> out_mongrp:
>> if (resctrl_arch_mon_capable())
>> kernfs_remove(kn_mongrp);
>
> Looks like this will miss counter cleanup on failure of mkdir_mondata_all().
Sure.
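
For illustration only, the unwind-ordering problem raised above can be
modelled in a small userspace sketch. Nothing below is kernel code: the
helper names (assign_cntrs, unassign_cntrs, mkdir_mondata, later_step,
get_tree_sketch) are hypothetical stand-ins, and releasing the counters
under the out_mongrp label is only one possible shape of a fix, not
necessarily what the next version of the patch will do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the resctrl helpers; this is not kernel code. */
static int counters_assigned;	/* counters held by the default group */
static bool mondata_created;	/* models the mon_data directory */

static void assign_cntrs(void)   { counters_assigned = 2; } /* total + local */
static void unassign_cntrs(void) { counters_assigned = 0; }

static int mkdir_mondata(bool fail)
{
	if (fail)
		return -1;
	mondata_created = true;
	return 0;
}

/* Stands in for any setup step that runs after mkdir_mondata(). */
static int later_step(bool fail)
{
	return fail ? -1 : 0;
}

/*
 * Labels fall through from top to bottom. A jump to out_mongrp skips
 * everything under out_mondata, so cleanup that must also run when
 * mkdir_mondata() fails has to sit at or below out_mongrp.
 */
static int get_tree_sketch(bool fail_mkdir, bool fail_later)
{
	int ret;

	assign_cntrs();

	ret = mkdir_mondata(fail_mkdir);
	if (ret < 0)
		goto out_mongrp;

	ret = later_step(fail_later);
	if (ret < 0)
		goto out_mondata;

	return 0;

out_mondata:
	mondata_created = false;
out_mongrp:
	/* Reached from both failure paths, so counters are always released. */
	unassign_cntrs();
	assert(counters_assigned == 0);
	return ret;
}
```

If unassign_cntrs() were placed under out_mondata instead, the
mkdir_mondata() failure path would return with counters_assigned still
set to 2, which is the leak pointed out in the review comment.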
>
>> @@ -3238,6 +3301,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
>>
>> head = &rdtgrp->mon.crdtgrp_list;
>> list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
>> + rdtgroup_unassign_cntrs(sentry);
>> free_rmid(sentry->closid, sentry->mon.rmid);
>> list_del(&sentry->mon.crdtgrp_list);
>>
>> @@ -3278,6 +3342,8 @@ static void rmdir_all_sub(void)
>> cpumask_or(&rdtgroup_default.cpu_mask,
>> &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
>>
>> + rdtgroup_unassign_cntrs(rdtgrp);
>> +
>> free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>
>> kernfs_remove(rdtgrp->kn);
>> @@ -3309,6 +3375,8 @@ static void rdt_kill_sb(struct super_block *sb)
>> for_each_alloc_capable_rdt_resource(r)
>> reset_all_ctrls(r);
>> rmdir_all_sub();
>> + rdtgroup_unassign_cntrs(&rdtgroup_default);
>> + mbm_cntr_reset(&rdt_resources_all[RDT_RESOURCE_L3].r_resctrl);
>> rdt_pseudo_lock_release();
>> rdtgroup_default.mode = RDT_MODE_SHAREABLE;
>> schemata_list_destroy();
>> @@ -3772,6 +3840,8 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
>> }
>> rdtgrp->mon.rmid = ret;
>>
>> + rdtgroup_assign_cntrs(rdtgrp);
>> +
>> ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
>> if (ret) {
>> rdt_last_cmd_puts("kernfs subdir error\n");
>
> Cleanup of assigned counters if mkdir_mondata_all() fails seems to be missing here also.
Sure.
Thanks
Babu