[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c6d5dd6e-2dec-423c-af39-213f17b1a9db@paulmck-laptop>
Date: Thu, 4 Jan 2024 06:50:52 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Like Xu <like.xu@...ux.intel.com>, Andi Kleen <ak@...ux.intel.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Luwei Kang <luwei.kang@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Paolo Bonzini <pbonzini@...hat.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, Breno Leitao <leitao@...ian.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Ingo Molnar <mingo@...hat.com>
Subject: Re: [BUG] Guest OSes die simultaneously (bisected)
On Wed, Jan 03, 2024 at 05:00:35PM -0800, Paul E. McKenney wrote:
> On Wed, Jan 03, 2024 at 04:24:06PM -0800, Sean Christopherson wrote:
> > On Wed, Jan 03, 2024, Paul E. McKenney wrote:
> > > On Wed, Jan 03, 2024 at 02:22:23PM -0800, Paul E. McKenney wrote:
> > > > Hello!
> > > >
> > > > Since some time between v5.19 and v6.4, long-running rcutorture tests
> > > > would (rarely but intolerably often) have all guests on a given host die
> > > > simultaneously with something like an instruction fault or a segmentation
> > > > violation.
> > > >
> > > > Each bisection step required 20 hosts running 10 hours each, and
> > > > this eventually fingered commit c59a1f106f5c ("KVM: x86/pmu: Add
> > > > IA32_PEBS_ENABLE MSR emulation for extended PEBS"). Although this commit
> > > > is certainly messing with things that could possibly cause all manner
> > > > of mischief, I don't immediately see a smoking gun. Except that the
> > > > commit prior to this one is rock solid.
> > > > Just to make things a bit more exciting, bisection in mainline proved
> > > > to be problematic due to bugs of various kinds that hid this one. I was
> > > > therefore forced to bisect among the commits backported to the internal
> > > > v5.19-based kernel, which fingered the backported version of the patch
> > > > called out above.
> > >
> > > Ah, and so why do I believe that this is a problem in mainline rather
> > > than just (say) a backporting mistake?
> > >
> > > Because this issue was first located in v6.4, which already has this
> > > commit included.
> > >
> > > Thanx, Paul
> > >
> > > > Please note that this is not (yet) an emergency. I will just continue
> > > > to run rcutorture on v5.19-based hypervisors in the meantime.
> > > >
> > > > Any suggestions for debugging or fixing?
> >
> > This looks suspect:
> >
> > + u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> > + int global_ctrl, pebs_enable;
> >
> > - arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
> > - arr[0].host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
> > - arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
> > - arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
> > - *nr = 1;
> > + *nr = 0;
> > + global_ctrl = (*nr)++;
> > + arr[global_ctrl] = (struct perf_guest_switch_msr){
> > + .msr = MSR_CORE_PERF_GLOBAL_CTRL,
> > + .host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
> > + .guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
> > + };
> >
> >
> > IIUC (always a big if with this code), the intent is that the guest's version of
> > PERF_GLOBAL_CTRL gets bits that are (a) not exclusive to the host and (b) not
> > being used for PEBS. (b) is necessary because PEBS generates records in memory
> > using virtual addresses, i.e. the CPU will write to memory using a virtual address
> > that is valid for the host but not the guest. And so PMU counters that are
> > configured to generate PEBS records need to be disabled while running the guest.
> >
> > Before that commit, the logic was:
> >
> > guest[PERF_GLOBAL_CTRL] = ctrl & ~host;
> > guest[PERF_GLOBAL_CTRL] &= ~pebs;
> >
> > But after, it's now:
> >
> > guest[PERF_GLOBAL_CTRL] = ctrl & (~host | ~pebs);
> >
> > I.e. the kernel is enabled counters in the guest that are not host-only OR not
> > PEBS. E.g. if only counter 0 is in use, it's using PEBS, but it's not exclusive
> > to the host, then the new code will yield (truncated to a single byte for sanity)
> >
> > 1 = 1 & (0xf | 0xe)
> >
> > and thus keep counter 0 enabled, whereas the old code would yield
> >
> > 1 = 1 & 0xf
> > 0 = 1 & 0xe
> >
> > A bit of a shot in the dark and completed untested, but I think this is the correct
> > fix?
>
> I am firing off some tests, and either way, thank you very much!!!
Woo-hoo!!! ;-)
Tested-by: Paul E. McKenney <paulmck@...nel.org>
Will you be sending a proper patch, or would you prefer that I do so?
In the latter case, I would need your Signed-off-by.
And again, thank you very much!!!
Thanx, Paul
> > diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> > index a08f794a0e79..92d5a3464cb2 100644
> > --- a/arch/x86/events/intel/core.c
> > +++ b/arch/x86/events/intel/core.c
> > @@ -4056,7 +4056,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
> > arr[global_ctrl] = (struct perf_guest_switch_msr){
> > .msr = MSR_CORE_PERF_GLOBAL_CTRL,
> > .host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
> > - .guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
> > + .guest = intel_ctrl & ~(cpuc->intel_ctrl_host_mask | pebs_mask),
> > };
> >
> > if (!x86_pmu.pebs)
> >
Powered by blists - more mailing lists