[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <27e33283-a18f-bc4a-eedd-1d30907c9efa@intel.com>
Date: Mon, 19 Jul 2021 13:43:42 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <tglx@...utronix.de>,
<mingo@...hat.com>, <bp@...en8.de>
CC: <fenghua.yu@...el.com>, <x86@...nel.org>,
<linux-kernel@...r.kernel.org>, <hpa@...or.com>,
<pawel.szulik@...el.com>, "Luck, Tony" <tony.luck@...el.com>
Subject: Re: [PATCH] x86/resctrl: Fix default monitoring groups reporting
Hi Babu,
On 6/29/2021 11:07 AM, Babu Moger wrote:
> From: Babu Moger <Babu.Moger@....com>
>
> Creating a new sub monitoring group in the root /sys/fs/resctrl leads to
> getting the "Unavailable" value for mbm_total_bytes and mbm_local_bytes on
> the entire filesystem.
>
> Steps to reproduce.
> 1. #mount -t resctrl resctrl /sys/fs/resctrl/
>
> 2. #cd /sys/fs/resctrl/
>
> 3. #cat mon_data/mon_L3_00/mbm_total_bytes 23189832
>
> 4. #mkdir mon_groups/test1 (create sub monitor group)
>
> 5. #cat mon_data/mon_L3_00/mbm_total_bytes Unavailable
>
> When a new monitoring group is created, a new RMID is assigned to the new
> group. But the RMID is not active yet. When the events are read on the new
> RMID, it is expected to report the status as "Unavailable".
>
> When the user reads the events on the default monitoring group with
> multiple subgroups, the events on all sub groups are consolidated together.
> Currently, if any of the RMID reads report as "Unavailable", then
> everything will be reported as "Unavailable".
>
> Fix the issue by discarding the "Unavailable" reads and reporting all the
> successful RMID reads. This is not a problem on Intel systesm as Intel
systesm -> systems
> reports 0 on Inactive RMIDs.
>
> Reported-by: Paweł Szulik <pawel.szulik@...el.com>
> Signed-off-by: Babu Moger <Babu.Moger@....com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=213311
Is a "Fixes" available? If this is specific to AMD then could this be
the change that enabled AMD systems?
> ---
> arch/x86/kernel/cpu/resctrl/monitor.c | 27 +++++++++++++--------------
> 1 file changed, 13 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index dbeaa8409313..9573a30c0587 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -285,15 +285,14 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
> return chunks >>= shift;
> }
>
> -static int __mon_event_count(u32 rmid, struct rmid_read *rr)
> +static u64 __mon_event_count(u32 rmid, struct rmid_read *rr)
> {
> struct mbm_state *m;
> u64 chunks, tval;
>
> tval = __rmid_read(rmid, rr->evtid);
> if (tval & (RMID_VAL_ERROR | RMID_VAL_UNAVAIL)) {
> - rr->val = tval;
> - return -EINVAL;
> + return tval;
> }
> switch (rr->evtid) {
> case QOS_L3_OCCUP_EVENT_ID:
> @@ -305,12 +304,6 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
> case QOS_L3_MBM_LOCAL_EVENT_ID:
> m = &rr->d->mbm_local[rmid];
> break;
> - default:
> - /*
> - * Code would never reach here because
> - * an invalid event id would fail the __rmid_read.
> - */
> - return -EINVAL;
> }
>
> if (rr->first) {
> @@ -361,23 +354,29 @@ void mon_event_count(void *info)
> struct rdtgroup *rdtgrp, *entry;
> struct rmid_read *rr = info;
> struct list_head *head;
> + u64 ret_val;
>
> rdtgrp = rr->rgrp;
>
> - if (__mon_event_count(rdtgrp->mon.rmid, rr))
> - return;
> + ret_val = __mon_event_count(rdtgrp->mon.rmid, rr);
>
> /*
> - * For Ctrl groups read data from child monitor groups.
> + * For Ctrl groups read data from child monitor groups and
> + * add them together. Count events which are read successfully.
> + * Discard the rmid_read's reporting errors.
> */
> head = &rdtgrp->mon.crdtgrp_list;
>
> if (rdtgrp->type == RDTCTRL_GROUP) {
> list_for_each_entry(entry, head, mon.crdtgrp_list) {
> - if (__mon_event_count(entry->mon.rmid, rr))
> - return;
> + if (__mon_event_count(entry->mon.rmid, rr) == 0)
> + ret_val = 0;
> }
> }
> +
> + /* Report error if none of rmid_reads are successful */
> + if (ret_val)
> + rr->val = ret_val;
> }
>
> /*
>
With the commit message comments addressed:
Acked-by: Reinette Chatre <reinette.chatre@...el.com>
Thank you very much
Reinette
Powered by blists - more mailing lists