Message-ID: <CALPaoChW=3T2EmrcW+eLEnDUY00rsRSQarTN3c0hSDX-FDRqvw@mail.gmail.com>
Date:   Wed, 7 Jun 2023 14:51:40 +0200
From:   Peter Newman <peternewman@...gle.com>
To:     James Morse <james.morse@....com>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org,
        Fenghua Yu <fenghua.yu@...el.com>,
        Reinette Chatre <reinette.chatre@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        H Peter Anvin <hpa@...or.com>,
        Babu Moger <Babu.Moger@....com>,
        shameerali.kolothum.thodi@...wei.com,
        D Scott Phillips OS <scott@...amperecomputing.com>,
        carl@...amperecomputing.com, lcherian@...vell.com,
        bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
        xingxin.hx@...nanolis.org, baolin.wang@...ux.alibaba.com,
        Jamie Iles <quic_jiles@...cinc.com>,
        Xin Hao <xhao@...ux.alibaba.com>, dfustini@...libre.com
Subject: Re: [PATCH v4 12/24] x86/resctrl: Make resctrl_arch_rmid_read() retry
 when it is interrupted

Hi James,

On Tue, Jun 6, 2023 at 7:03 PM James Morse <james.morse@....com> wrote:
> On 06/06/2023 09:49, Peter Newman wrote:
> > It looks like if __rmid_read() is interrupted by an occupancy counter
> > read between writing QM_EVTSEL and reading QM_CTR, it will not perform
> > any update to am->prev_msr, and the interrupted read will return the
> > same counter value as in the interrupting read.
>
> Yup, that's a problem. I was only looking at the mbm state in memory, not the CPU register.
> I think the fix is to read back QM_EVTSEL after reading QM_CTR. I'll do this in
> __rmid_read() to avoid returning -EINTR. It creates two retry loops which is annoying, but
> making the window larger means you're more likely to see false positives.
>
> ----------------------------%<----------------------------
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index e24390d2e661..aeba035bb680 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -101,6 +101,7 @@ static inline u64 get_corrected_mbm_count(u32 rmid, unsigned long val)
>
>  static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
>  {
> +       u32 _rmid, _eventid;
>         u64 msr_val;
>
>         /*
> @@ -110,9 +111,15 @@ static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
>          * IA32_QM_CTR.data (bits 61:0) reports the monitored data.
>          * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
>          * are error bits.
> +        * QM_EVTSEL is re-read to detect if this function was interrupted by
> +        * another call, meaning the QM_CTR value may belong to a different
> +        * event.
>          */
> -       wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
> -       rdmsrl(MSR_IA32_QM_CTR, msr_val);
> +       do {
> +               wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
> +               rdmsrl(MSR_IA32_QM_CTR, msr_val);
> +               rdmsr(MSR_IA32_QM_EVTSEL, _eventid, _rmid);
> +       } while (eventid != _eventid || rmid != _rmid);
>
>         if (msr_val & RMID_VAL_ERROR)
>                 return -EIO;
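
To make the interleaving concrete, here is a user-space model of that retry. The MSR accessors are mocked and all names are made up for illustration; this is not the kernel code. An "interrupting" read clobbers EVTSEL between the EVTSEL write and the CTR read, and the EVTSEL read-back catches it, forcing a retry:

```c
#include <assert.h>
#include <stdint.h>

static uint64_t mock_evtsel;   /* stands in for IA32_QM_EVTSEL */
static uint64_t mock_ctr;      /* stands in for IA32_QM_CTR */
static void (*irq_hook)(void); /* simulates an interrupting read */

static void wrmsr_evtsel(uint32_t eventid, uint32_t rmid)
{
	mock_evtsel = ((uint64_t)rmid << 32) | eventid;
	/* give each (rmid, eventid) selection a distinct counter value */
	mock_ctr = mock_evtsel ^ 0x1234;
}

static uint64_t rdmsr_ctr(void)
{
	/* fire the "interrupt" once, between EVTSEL write and CTR read */
	if (irq_hook) {
		void (*hook)(void) = irq_hook;
		irq_hook = 0;
		hook();
	}
	return mock_ctr;
}

static uint64_t rdmsr_evtsel(void)
{
	return mock_evtsel;
}

/* Mirrors the patch: retry until EVTSEL still holds our event/RMID,
 * proving the CTR value read belongs to this selection. */
static uint64_t rmid_read(uint32_t rmid, uint32_t eventid)
{
	uint64_t val, sel;

	do {
		wrmsr_evtsel(eventid, rmid);
		val = rdmsr_ctr();
		sel = rdmsr_evtsel();
	} while (sel != (((uint64_t)rmid << 32) | eventid));

	return val;
}

static void interrupting_read(void)
{
	wrmsr_evtsel(2, 99);	/* a different event clobbers EVTSEL */
}
```

With `irq_hook` armed, the first iteration reads a CTR value belonging to event 2/RMID 99, the read-back mismatches, and the second iteration returns the correct value.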

I happen to be tracking the cost of resctrl_arch_rmid_read() calls, so
I measured the impact of your fix on my AMD EPYC 7B12:

with both this and the soft RMID series[1] applied:

Base switch         7955     0.23%
Hard RMID Switch    8476     6.80%
Soft RMID Switch    10173   28.17%
CLOSID+Soft RMID    10740   35.32%

then after adding the EVTSEL read-back patch:

Base switch         7985
Hard RMID Switch    8540     6.96%
Soft RMID Switch    11015   37.95%
CLOSID+Soft RMID    11590   45.16%

The Soft RMID switches contain two __rmid_read() calls, so this
implies each QM_EVTSEL read-back is around 420 cycles on this AMD
implementation.
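
The arithmetic behind that estimate, using the soft RMID switch numbers above:

```c
/* Back-of-envelope check of the ~420-cycle figure: the read-back patch
 * adds its cost to each of the two __rmid_read() calls per switch. */
static int per_readback_cycles(int cycles_before, int cycles_after)
{
	return (cycles_after - cycles_before) / 2;
}
```

(11015 − 10173) / 2 = 421 cycles; the CLOSID+Soft RMID rows give a consistent (11590 − 10740) / 2 = 425.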

Even if you don't agree with my plan to add resctrl_arch_rmid_read()
calls to context switches, there should be cheaper ways to handle
this.
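
One hypothetical cheaper direction (not from any patch in this thread; all names here are made up for illustration): have every reader bump a per-CPU sequence counter on entry, so detecting an interrupting read costs a cache-hot memory access instead of a ~420-cycle MSR read-back:

```c
#include <assert.h>
#include <stdint.h>

/* Per-CPU in a real kernel; a plain global suffices for this sketch. */
static uint64_t rmid_read_seq;

/* Mocked MSR accessors so the sketch is self-contained. */
static void mock_wrmsr_evtsel(uint32_t eventid, uint32_t rmid)
{
	(void)eventid; (void)rmid;
}

static uint64_t mock_rdmsr_ctr(void)
{
	return 42;	/* arbitrary counter value */
}

static uint64_t rmid_read_seqcount(uint32_t rmid, uint32_t eventid)
{
	uint64_t seq, val;

	do {
		seq = ++rmid_read_seq;	/* announce this read */
		mock_wrmsr_evtsel(eventid, rmid);
		val = mock_rdmsr_ctr();
		/* any interrupting call would have bumped the counter,
		 * making this comparison fail and forcing a retry */
	} while (rmid_read_seq != seq);

	return val;
}
```

This only works if every path that touches QM_EVTSEL goes through the same counter, but it would replace the MSR read-back on the (common) uninterrupted path with a single cached load and compare.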

-Peter

[1] https://lore.kernel.org/lkml/20230421141723.2405942-4-peternewman@google.com/
