Message-ID: <CALPaoChW=3T2EmrcW+eLEnDUY00rsRSQarTN3c0hSDX-FDRqvw@mail.gmail.com>
Date: Wed, 7 Jun 2023 14:51:40 +0200
From: Peter Newman <peternewman@...gle.com>
To: James Morse <james.morse@....com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org,
Fenghua Yu <fenghua.yu@...el.com>,
Reinette Chatre <reinette.chatre@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
H Peter Anvin <hpa@...or.com>,
Babu Moger <Babu.Moger@....com>,
shameerali.kolothum.thodi@...wei.com,
D Scott Phillips OS <scott@...amperecomputing.com>,
carl@...amperecomputing.com, lcherian@...vell.com,
bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
xingxin.hx@...nanolis.org, baolin.wang@...ux.alibaba.com,
Jamie Iles <quic_jiles@...cinc.com>,
Xin Hao <xhao@...ux.alibaba.com>, dfustini@...libre.com
Subject: Re: [PATCH v4 12/24] x86/resctrl: Make resctrl_arch_rmid_read() retry
 when it is interrupted

Hi James,
On Tue, Jun 6, 2023 at 7:03 PM James Morse <james.morse@....com> wrote:
> On 06/06/2023 09:49, Peter Newman wrote:
> > It looks like if __rmid_read() is interrupted by an occupancy counter
> > read between writing QM_EVTSEL and reading QM_CTR, it will not perform
> > any update to am->prev_msr, and the interrupted read will return the
> > same counter value as in the interrupting read.
>
> Yup, that's a problem. I was only looking at the mbm state in memory, not the CPU register.
> I think the fix is to read back QM_EVTSEL after reading QM_CTR. I'll do this in
> __rmid_read() to avoid returning -EINTR. It creates two retry loops, which is annoying, but
> making the window larger means you're more likely to see false positives.
>
> ----------------------------%<----------------------------
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index e24390d2e661..aeba035bb680 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -101,6 +101,7 @@ static inline u64 get_corrected_mbm_count(u32 rmid, unsigned long val)
>
> static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
> {
> + u32 _rmid, _eventid;
> u64 msr_val;
>
> /*
> @@ -110,9 +111,15 @@ static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
> * IA32_QM_CTR.data (bits 61:0) reports the monitored data.
> * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
> * are error bits.
> + * QM_EVTSEL is re-read to detect if this function was interrupted by
> + * another call, meaning the QM_CTR value may belong to a different
> + * event.
> */
> - wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
> - rdmsrl(MSR_IA32_QM_CTR, msr_val);
> + do {
> + wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
> + rdmsrl(MSR_IA32_QM_CTR, msr_val);
> + rdmsr(MSR_IA32_QM_EVTSEL, _eventid, _rmid);
> + } while (eventid != _eventid || rmid != _rmid);
>
> if (msr_val & RMID_VAL_ERROR)
> return -EIO;
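To spell out the interleaving the read-back catches (my own
illustration, not part of your patch):

/*
 * Task-context __rmid_read()        Interrupting __rmid_read()
 * --------------------------        --------------------------
 * wrmsr(QM_EVTSEL, e1, r1)
 *                                   wrmsr(QM_EVTSEL, e2, r2)
 *                                   rdmsrl(QM_CTR)
 * rdmsrl(QM_CTR)            <- returns the count for (r2, e2)
 * rdmsr(QM_EVTSEL)          <- reads back (e2, r2) != (e1, r1): retry
 */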
I happen to be tracking the cost of resctrl_arch_rmid_read() calls, so
I measured the impact of your fix on my AMD EPYC 7B12.
With both this and the soft RMID series[1] applied:

  Base switch          7955     0.23%
  Hard RMID Switch     8476     6.80%
  Soft RMID Switch    10173    28.17%
  CLOSID+Soft RMID    10740    35.32%
Then adding the EVTSEL read-back patch:

  Base switch          7985
  Hard RMID Switch     8540     6.96%
  Soft RMID Switch    11015    37.95%
  CLOSID+Soft RMID    11590    45.16%
The Soft RMID switches contain two __rmid_read() calls, so this
implies each QM_EVTSEL read-back costs around 420 cycles on this AMD
implementation ((11015 - 10173) / 2 ≈ 421).
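For anyone wanting to reproduce the read-back cost in isolation, a
minimal sketch would be to time a batch of the extra QM_EVTSEL reads
directly (hypothetical snippet, not the harness I used):

#include <asm/msr.h>
#include <linux/math64.h>

/* Average cost in cycles of one QM_EVTSEL read-back, over n reads. */
static u64 time_evtsel_readback(unsigned int n)
{
	u32 eventid, rmid;
	unsigned int i;
	u64 t0, t1;

	t0 = rdtsc_ordered();
	for (i = 0; i < n; i++)
		rdmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
	t1 = rdtsc_ordered();

	return div_u64(t1 - t0, n);
}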
Even if you don't agree with my plan to add resctrl_arch_rmid_read()
calls to context switches, there should be cheaper ways to handle
this.
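For example, since all the interleaved __rmid_read() calls for a
domain run on the same CPU, a per-CPU generation counter bumped by
every reader could detect interruption without the extra MSR access.
Rough sketch (rmid_read_gen is a made-up name, not existing code):

static DEFINE_PER_CPU(u64, rmid_read_gen);

static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
{
	u64 msr_val, gen;

	do {
		/* Every reader bumps this, so a nested read changes it. */
		gen = this_cpu_inc_return(rmid_read_gen);

		wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
		rdmsrl(MSR_IA32_QM_CTR, msr_val);

		/*
		 * A changed counter means another read ran in between
		 * and may have rewritten QM_EVTSEL, so msr_val could
		 * belong to a different (rmid, eventid): retry.
		 */
	} while (gen != this_cpu_read(rmid_read_gen));

	if (msr_val & RMID_VAL_ERROR)
		return -EIO;
	if (msr_val & RMID_VAL_UNAVAIL)
		return -EINVAL;

	*val = msr_val;
	return 0;
}

A percpu increment and compare should cost a few cycles rather than
~420, at the price of an occasional false-positive retry.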
-Peter
[1] https://lore.kernel.org/lkml/20230421141723.2405942-4-peternewman@google.com/