[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zzvtj8n1_ukhnRWT@agluck-desk3>
Date: Mon, 18 Nov 2024 17:44:47 -0800
From: "Luck, Tony" <tony.luck@...el.com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: babu.moger@....com, Fenghua Yu <fenghua.yu@...el.com>,
Peter Newman <peternewman@...gle.com>,
Jonathan Corbet <corbet@....net>, x86@...nel.org,
James Morse <james.morse@....com>,
Jamie Iles <quic_jiles@...cinc.com>,
Randy Dunlap <rdunlap@...radead.org>,
"Shaopeng Tan (Fujitsu)" <tan.shaopeng@...itsu.com>,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
patches@...ts.linux.dev
Subject: Re: [PATCH v9 3/9] x86/resctrl: Modify update_mba_bw() to use per
ctrl_mon group event
On Mon, Nov 18, 2024 at 04:51:38PM -0800, Reinette Chatre wrote:
> Hi Tony,
>
> On 11/18/24 4:01 PM, Luck, Tony wrote:
> > On Fri, Nov 15, 2024 at 10:21:01AM -0600, Moger, Babu wrote:
> >> Hi Tony,
> >
> > Thanks for looking at this patch.
> >
> >>
> >> On 11/13/2024 6:17 PM, Tony Luck wrote:
> >>> Instead of hard-coding the memory bandwidth local event as the
> >>> input to the mba_sc feedback look, use the event that the user
> >>> configured for each ctrl_mon group.
> >>>
> >>> Signed-off-by: Tony Luck <tony.luck@...el.com>
> >>> ---
> >>> arch/x86/kernel/cpu/resctrl/monitor.c | 23 ++++++++++++++++++-----
> >>> 1 file changed, 18 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> >>> index 7ef1a293cc13..2176e355e864 100644
> >>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> >>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> >>> @@ -752,20 +752,31 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
> >>> u32 closid, rmid, cur_msr_val, new_msr_val;
> >>> struct mbm_state *pmbm_data, *cmbm_data;
> >>> struct rdt_ctrl_domain *dom_mba;
> >>> + enum resctrl_event_id evt_id;
> >>> struct rdt_resource *r_mba;
> >>> - u32 cur_bw, user_bw, idx;
> >>> struct list_head *head;
> >>> struct rdtgroup *entry;
> >>> + u32 cur_bw, user_bw;
> >>> - if (!is_mbm_local_enabled())
> >>> + if (!is_mbm_enabled())
> >>> return;
> >>> r_mba = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
> >>> + evt_id = rgrp->mba_mbps_event;
> >>> +
> >>> + if (WARN_ON_ONCE(!is_mbm_event(evt_id)))
> >>> + return;
> >>
> >> I feel this check is enough.
> >>
> >>> + if (WARN_ON_ONCE(evt_id == QOS_L3_MBM_LOCAL_EVENT_ID && !is_mbm_local_enabled()))
> >>> + return;
> >>> + if (WARN_ON_ONCE(evt_id == QOS_L3_MBM_TOTAL_EVENT_ID && !is_mbm_total_enabled()))
> >>> + return;
> >>
> >> These two checks are not necessary. You are already validating it while
> >> initializing(in patch 7).
> >
> > I added this in response to a comment on v7 from Reinette that evt_id
> > wasn't properly validated here - in conjuction with the change a few
> > lines earlier that relaxed the check for is_mbm_local_enabled() to
> > just is_mbm_enabled().
>
> right that patch had an issue ... the "initialize" code hardcoded support to be
> r->membw.mba_mbps_event = QOS_L3_MBM_LOCAL_EVENT_ID;
> without any checking and then the handler used a relaxed check of
> is_mbm_enabled()
>
> On a system that only supports total MBM the is_mbm_enabled() check will
> pass while the event used will be local MBM.
In the v9 series I believe all the necessary checks are made outside
of the update_mba_bw() function itself.
update_mba_bw() is only called when is_mba_sc() returns true. Which
is the value of:
rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.mba_sc
which can only be set if mbm is enabled.
So instead of changing the check from is_mbm_local_enabled() to
is_mbm_enabled() it could be deleted.
rgrp->mba_mbps_event can only be set to QOS_L3_MBM_LOCAL_EVENT_ID
until patch 7 when the user can select QOS_L3_MBM_TOTAL_EVENT_ID
or patch 8 when the initiialization code can pick TOTAL on systems
that don't support LOCAL.
So all three of the WARN_ON_ONCE() calls are unnecessary.
Should I drop all these checks in v10?
>
> >
> > See: https://lore.kernel.org/r/bb30835f-5be9-44b4-8544-2f528e7fc573@intel.com
> >
> > In theory all of these tests could be dropped. As you point out the
> > sanity checks are done higher in the call sequence. But some folks
> > like the "belt and braces" approach to such sanity checks.
>
> If that "higher in the call sequence" can be trusted, yes. That was not the
> case when I made those statements. Sprinkling WARN() that continues execution
> in a known bad state does not seem safe to me either.
>
> Reinette
-Tony
Powered by blists - more mailing lists