[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <887f8946-6d2b-27bf-a49b-f83af05cbc68@amd.com>
Date: Tue, 19 Oct 2021 18:20:43 -0500
From: Babu Moger <babu.moger@....com>
To: James Morse <james.morse@....com>, x86@...nel.org,
linux-kernel@...r.kernel.org
Cc: Fenghua Yu <fenghua.yu@...el.com>,
Reinette Chatre <reinette.chatre@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
H Peter Anvin <hpa@...or.com>,
shameerali.kolothum.thodi@...wei.com,
Jamie Iles <jamie@...iainc.com>,
D Scott Phillips OS <scott@...amperecomputing.com>,
lcherian@...vell.com, bobo.shaobowang@...wei.com,
tan.shaopeng@...itsu.com
Subject: Re: [PATCH v2 17/23] x86/resctrl: Abstract __rmid_read()
Hi James,
On 10/1/21 11:02 AM, James Morse wrote:
> __rmid_read() selects the specified eventid and returns the counter
> value from the msr. The error handling is architecture specific, and
> handled by the callers, rdtgroup_mondata_show() and __mon_event_count().
>
> Error handling should be handled by architecture specific code, as
> a different architecture may have different requirements. MPAM's
> counters can report that they are 'not ready', requiring a second
> read after a short delay. This should be hidden from resctrl.
>
> Make __rmid_read() the architecture specific function for reading
> a counter. Rename it resctrl_arch_rmid_read() and move the error
> handling into it.
>
> Signed-off-by: James Morse <james.morse@....com>
> ---
> Changes since v1:
> * Return EINVAL from the impossible case in __mon_event_count() instead
> of an x86 hardware specific value.
> ---
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 4 +--
> arch/x86/kernel/cpu/resctrl/internal.h | 2 +-
> arch/x86/kernel/cpu/resctrl/monitor.c | 42 +++++++++++++++--------
> include/linux/resctrl.h | 1 +
> 4 files changed, 31 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 25baacd331e0..c8ca7184c6d9 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -579,9 +579,9 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>
> mon_event_read(&rr, r, d, rdtgrp, evtid, false);
>
> - if (rr.val & RMID_VAL_ERROR)
> + if (rr.err == -EIO)
> seq_puts(m, "Error\n");
> - else if (rr.val & RMID_VAL_UNAVAIL)
> + else if (rr.err == -EINVAL)
> seq_puts(m, "Unavailable\n");
> else
> seq_printf(m, "%llu\n", rr.val * hw_res->mon_scale);
This patch breaks the earlier fix
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.15-rc6&id=064855a69003c24bd6b473b367d364e418c57625
When the user reads the events on the default monitoring group with
multiple subgroups, the events on all subgroups are consolidated
together. In case if the last rmid read was resulted in error then whole
group will be reported as error. The err field needs to be cleared.
Please add this patch to clear the error.
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c
b/arch/x86/kernel/cpu/resctrl/monitor.c
index 14bc843043da..0e4addf237ec 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -444,6 +444,8 @@ void mon_event_count(void *info)
/* Report error if none of rmid_reads are successful */
if (ret_val)
rr->val = ret_val;
+ else
+ rr->err = 0;
}
/*
Powered by blists - more mailing lists