Message-ID: <CALPaoCiO_ByaetFBAqSGK4_26jNsHD00gyx-u6aqfy9n_Ys7Ng@mail.gmail.com>
Date: Mon, 19 Dec 2022 11:31:45 +0100
From: Peter Newman <peternewman@...gle.com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: fenghua.yu@...el.com, Babu.Moger@....com, bp@...en8.de,
dave.hansen@...ux.intel.com, eranian@...gle.com, hpa@...or.com,
james.morse@....com, linux-kernel@...r.kernel.org,
mingo@...hat.com, quic_jiles@...cinc.com, tan.shaopeng@...itsu.com,
tglx@...utronix.de, x86@...nel.org
Subject: Re: [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs
Hi Reinette,
On Sat, Dec 17, 2022 at 1:59 AM Reinette Chatre
<reinette.chatre@...el.com> wrote:
> On 12/14/2022 8:08 AM, Peter Newman wrote:
> > When creating a new monitoring group, the RMID allocated for it may have
> > been used by a group which was previously removed. In this case, the
> > hardware counters will have non-zero values which should be deducted
> > from what is reported in the new group's counts.
> >
> > resctrl_arch_reset_rmid() initializes the prev_msr value for counters to
> > 0, causing the initial count to be charged to the new group. Resurrect
> > __rmid_read() and use it to initialize prev_msr correctly.
> >
> > Unlike before, __rmid_read() checks for error bits in the MSR read so
> > that callers don't need to.
> >
> > Fixes: 1d81d15db39c ("x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()")
> > Signed-off-by: Peter Newman <peternewman@...gle.com>
>
> This does look like a candidate for stable?
Yes, this bug is serious and reproducible. Every RMID reuse can carry
up to one counter overflow's worth of measurement error.
Should I elaborate on the impact more in the changelog?
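To make the impact concrete, here is a minimal user-space sketch of the
accounting, not the kernel code itself. The 24-bit counter width and the
names overflow_delta() and reported_after_reuse() are assumptions made for
illustration; the delta calculation only mirrors the idea behind
mbm_overflow_count().

```c
#include <stdint.h>

/* Assumed counter width for this sketch; real hardware may differ. */
#define COUNTER_WIDTH 24u
#define COUNTER_MASK  ((UINT64_C(1) << COUNTER_WIDTH) - 1)

/*
 * Hypothetical helper: difference between two counter reads, tolerating
 * at most one wraparound of a counter that is `width` bits wide.
 */
static uint64_t overflow_delta(uint64_t prev, uint64_t cur, unsigned int width)
{
	uint64_t shift = 64 - width;

	return ((cur << shift) - (prev << shift)) >> shift;
}

/*
 * Hypothetical model of the first read of a new group on a reused RMID.
 *
 * initial_prev: value prev_msr was set to when the group was created
 * stale_hw:     hardware counter value left behind by the removed group
 * traffic:      chunks actually generated by the new group
 *
 * Returns the count that would be reported for the new group.
 */
static uint64_t reported_after_reuse(uint64_t initial_prev, uint64_t stale_hw,
				     uint64_t traffic)
{
	uint64_t hw = (stale_hw + traffic) & COUNTER_MASK;

	return overflow_delta(initial_prev, hw, COUNTER_WIDTH);
}
```

With initial_prev set to 0, the reported value includes whatever the
previous owner left in the counter. With initial_prev set to the counter
value read at group creation, only the new group's traffic is reported,
including when the counter wraps once in between.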
>
> > ---
>
> It is helpful to have a summary here of what changed since previous version.
OK, I'll add this.
> Thank you very much for catching and fixing this.
>
> Reviewed-by: Reinette Chatre <reinette.chatre@...el.com>
Thanks, Reinette!
-Peter