[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1178bf68-f355-1509-0849-b740d2906bae@amd.com>
Date: Wed, 16 Aug 2023 13:17:09 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de
Cc: fenghua.yu@...el.com, dave.hansen@...ux.intel.com, x86@...nel.org,
hpa@...or.com, paulmck@...nel.org, akpm@...ux-foundation.org,
quic_neeraju@...cinc.com, rdunlap@...radead.org,
damien.lemoal@...nsource.wdc.com, songmuchun@...edance.com,
peterz@...radead.org, jpoimboe@...nel.org, pbonzini@...hat.com,
chang.seok.bae@...el.com, pawan.kumar.gupta@...ux.intel.com,
jmattson@...gle.com, daniel.sneddon@...ux.intel.com,
sandipan.das@....com, tony.luck@...el.com, james.morse@....com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
bagasdotme@...il.com, eranian@...gle.com,
christophe.leroy@...roup.eu, jarkko@...nel.org,
adrian.hunter@...el.com, quic_jiles@...cinc.com,
peternewman@...gle.com
Subject: Re: [PATCH v7 5/8] x86/resctrl: Unwind the errors inside
rdt_enable_ctx()
Hi Reinette,
On 8/15/23 17:47, Reinette Chatre wrote:
> Hi Babu,
>
> On 8/11/2023 1:09 PM, Babu Moger wrote:
>> rdt_enable_ctx() enables the features provided during resctrl mount.
>>
>> Additions to rdt_enable_ctx() are required to also modify error paths
>> of rdt_enable_ctx() callers to ensure correct unwinding if errors
>> are encountered after calling rdt_enable_ctx(). This is error prone.
>>
>> Introduce rdt_disable_ctx() to refactor the error unwinding of
>> rdt_enable_ctx() to simplify future additions.
>>
>> Suggested-by: Reinette Chatre <reinette.chatre@...el.com>
>> Signed-off-by: Babu Moger <babu.moger@....com>
>> ---
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 44 ++++++++++++++++++++++++--------
>> 1 file changed, 33 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 3010e3a1394d..0805fac04401 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2377,19 +2377,44 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn,
>> struct rdtgroup *prgrp,
>> struct kernfs_node **mon_data_kn);
>>
>> +static void rdt_disable_ctx(struct rdt_fs_context *ctx)
>> +{
>> + if (ctx->enable_cdpl2)
>> + resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, false);
>> +
>> + if (ctx->enable_cdpl3)
>> + resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, false);
>> +
>> + if (ctx->enable_mba_mbps)
>> + set_mba_sc(false);
>> +}
>
> I am not sure if rdt_disable_ctx() should depend on the
> fs context (more later).
>
>> +
>> static int rdt_enable_ctx(struct rdt_fs_context *ctx)
>> {
>> int ret = 0;
>>
>> - if (ctx->enable_cdpl2)
>> + if (ctx->enable_cdpl2) {
>> ret = resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, true);
>> + if (ret)
>> + goto out_disable;
>> + }
>>
>> - if (!ret && ctx->enable_cdpl3)
>> + if (ctx->enable_cdpl3) {
>> ret = resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, true);
>> + if (ret)
>> + goto out_disable;
>> + }
>>
>> - if (!ret && ctx->enable_mba_mbps)
>> + if (ctx->enable_mba_mbps) {
>> ret = set_mba_sc(true);
>> + if (ret)
>> + goto out_disable;
>> + }
>> +
>> + return 0;
>>
>> +out_disable:
>> + rdt_disable_ctx(ctx);
>> return ret;
>
> This is not the exit pattern we usually follow. Also note
> that it could lead to incorrect behavior if there is an early failure
> in this function but all unwinding would end up being done in
> rdt_disable_ctx() because error unwinding is done based on whether
> a feature _should_ be enabled not whether it was enabled.
Yes. That is correct.
> Could you please instead have direct error handling by only undoing
> what was already done at the time of the error?
Sure.
>
>> }
>>
>> @@ -2497,13 +2522,13 @@ static int rdt_get_tree(struct fs_context *fc)
>> }
>>
>> ret = rdt_enable_ctx(ctx);
>> - if (ret < 0)
>> - goto out_cdp;
>> + if (ret)
>> + goto out;
>>
>> ret = schemata_list_create();
>> if (ret) {
>> schemata_list_destroy();
>> - goto out_mba;
>> + goto out_ctx;
>> }
>>
>> closid_init();
>> @@ -2562,11 +2587,8 @@ static int rdt_get_tree(struct fs_context *fc)
>> kernfs_remove(kn_info);
>> out_schemata_free:
>> schemata_list_destroy();
>> -out_mba:
>> - if (ctx->enable_mba_mbps)
>> - set_mba_sc(false);
>> -out_cdp:
>> - cdp_disable_all();
>> +out_ctx:
>> + rdt_disable_ctx(ctx);
>> out:
>> rdt_last_cmd_clear();
>> mutex_unlock(&rdtgroup_mutex);
>>
>>
>
> rdt_enable_ctx() is called in rdt_get_tree() and thus its work should
> also be cleaned up in rdt_kill_sb(). Note how the cleanup you replace
> here is also duplicated in rdt_kill_sb(), meaning the unwinding continues
> to be open coded and patch #7 propagates this.
> Could rdt_kill_sb() not also call rdt_disable_ctx()? This brings me
> back to the earlier comment about it depending on the fs context. I
> do not know if the context will be valid at this time. I do not
> think that the context is required though it could have no parameters
> and do cleanup as is done at the moment.
At rdt_kill_sb() the fs context is already freed. But, we can call
rdt_disable_ctx() with no parameter. We will have to depend on other
parameters to free the enabled features. We can use the same call both in
rdt_get_tree() (the failure path above) and in rdt_kill_sb().
The function rdt_disable_ctx will look like this.
+static void rdt_disable_ctx(void)
+{
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L3))
+ resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, false);
+
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2))
+ resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, false);
+
+ if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl))
+ set_mba_sc(false);
+}
--
Thanks
Babu Moger
Powered by blists - more mailing lists