Message-ID: <b1a885fa4fbfe432c2e4bda2725d5bc78b1a6400.camel@redhat.com>
Date: Tue, 21 Jan 2025 17:56:58 -0500
From: Maxim Levitsky <mlevitsk@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: vmx_pmu_caps_test fails on Skylake based CPUS due to read only LBRs
On Fri, 2024-12-13 at 19:20 -0500, Maxim Levitsky wrote:
> On Thu, 2024-11-21 at 22:35 -0500, Maxim Levitsky wrote:
> > On Sun, 2024-11-03 at 18:32 -0500, Maxim Levitsky wrote:
> > > On Mon, 2024-10-28 at 08:55 -0700, Sean Christopherson wrote:
> > > > On Fri, Oct 18, 2024, Maxim Levitsky wrote:
> > > > > Hi,
> > > > >
> > > > > Our CI found another issue, this time with vmx_pmu_caps_test.
> > > > >
> > > > > > On 'Intel(R) Xeon(R) Gold 6328HL CPU' I see that all LBR MSRs (FROM/TO and
> > > > > > TOS) are always read-only, even when LBR is disabled: once I disable the
> > > > > > feature in DEBUG_CTL, all LBR MSRs reset to 0, and you can't change their
> > > > > > value manually. Freeze LBRs on PMI does not seem to affect this behavior.
> > > > >
> > > > > I don't know if this is how the hardware is supposed to work (Intel's manual
> > > > > doesn't mention anything about this), or if it is something platform
> > > > > specific, because this system also was found to have LBRs enabled
> > > > > > (IA32_DEBUGCTL.LBR == 1) after a fresh boot, as if the BIOS left them enabled.
> > > > > > I have no idea why.
> > > > >
> > > > > The problem is that vmx_pmu_caps_test writes 0 to LBR_TOS via KVM_SET_MSRS,
> > > > > > and KVM actually passes this write to the actual hardware MSR (this is somewhat
> > > > > > weird),
> > > >
> > > > When the "virtual" LBR event is active in host perf, the LBR MSRs are passed
> > > > through to the guest, and so KVM needs to propagate the guest values into hardware.
> > >
> > > Yes, but usually KVM_SET_MSRS doesn't touch hardware directly, even for registers/msrs
> > > that are passed through, but rather the relevant values are loaded when the guest vCPU
> > > is loaded and/or when the guest is entered.
> > > I don't know the details though.
> > >
> > >
> > > > > and since the MSR is not writable and silently drops writes instead,
> > > > > once the test tries to read it, it gets some random value instead.
> > > >
> > > > This just showed up in our testing too (delayed backport on our end). I haven't
> > > > (yet) tried debugging our setup, but is there any chance Intel PT is interfering?
> > > >
> > > > 33.3.1.2 Model Specific Capability Restrictions
> > > > Some processor generations impose restrictions that prevent use of
> > > > LBRs/BTS/BTM/LERs when software has enabled tracing with Intel Processor Trace.
> > > > On these processors, when TraceEn is set, updates of LBR, BTS, BTM, LERs are
> > > > suspended but the states of the corresponding IA32_DEBUGCTL control fields
> > > > remained unchanged as if it were still enabled. When TraceEn is cleared, the
> > > > LBR array is reset, and LBR/BTS/BTM/LERs updates will resume.
> > > > Further, reads of these registers will return 0, and writes will be dropped.
> > > >
> > > > The list of MSRs whose updates/accesses are restricted follows.
> > > >
> > > > • MSR_LASTBRANCH_x_TO_IP, MSR_LASTBRANCH_x_FROM_IP, MSR_LBR_INFO_x, MSR_LASTBRANCH_TOS
> > > > • MSR_LER_FROM_LIP, MSR_LER_TO_LIP
> > > > • MSR_LBR_SELECT
> > > >
> > > > For processors with CPUID DisplayFamily_DisplayModel signatures of 06_3DH,
> > > > 06_47H, 06_4EH, 06_4FH, 06_56H, and 06_5EH, the use of Intel PT and LBRs are
> > > > mutually exclusive.
> > > >
> > > > If Intel PT is NOT responsible, i.e. the behavior really is due to DEBUG_CTL.LBR=0,
> > > > then I don't see how KVM can sanely virtualize LBRs.
> > > >
> > >
> > > Hi!
> > >
> > >
> > > I will check PT influence soon, but to me it looks like the hardware implementation has changed.
> > > It is just too consistent:
> > >
> > > When DEBUG_CTL.LBR=1, the LBRs do work, I see all the registers update, although
> > > TOS does seem to be stuck at one value, but it does change sometimes, and it's non zero.
> > >
> > > The FROM/TO MSRs do show a healthy amount of updates.
> > >
> > > Note that I read all msrs using 'rdmsr' userspace tool.
> > >
> > > However as soon as I disable DEBUG_CTL.LBR, all these MSRs reset to 0, and can't be changed.
> >
> > Hi,
> > I tested this on another Skylake-based machine (Intel(R) Xeon(R) Silver 4214) and I see the same behavior:
> > LBR_TOS is read-only:
> >
> > It's 0 when LBRs are disabled in DEBUG_CTL, and running (changes all the time as expected)
> > when LBRs are enabled in DEBUG_CTL.
> >
> > IA32_RTIT_CTL.TraceEn is disabled (msr 0x570 is 0).
> >
> > Also, on this machine the BIOS didn't leave LBRs running.
> >
> > I guess we need to at least disable the check in the unit test, or at least
> > speak with someone from Intel to clarify what is going on.
>
> Any update on this?
Hi,
I hate to sound like a broken record, but any update on this?
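
For reference, the observation quoted above could be reproduced with a rough
sketch like the one below. It is an illustration only, not the procedure used
in this thread (the readings above were taken with the 'rdmsr' tool). The
helper names DevMsr and probe_lbr_tos are hypothetical. The sketch assumes
root access, the 'msr' kernel module, and a CPU that implements
IA32_RTIT_CTL; it temporarily clears IA32_DEBUGCTL.LBR on the chosen CPU, so
it should not be run while something else is using LBRs.

```python
import os
import struct

# Architectural MSR addresses (Intel SDM); 0x570 is also the one cited in this thread.
MSR_IA32_DEBUGCTL = 0x1D9    # bit 0: LBR enable
MSR_LASTBRANCH_TOS = 0x1C9   # top-of-stack index of the LBR array
MSR_IA32_RTIT_CTL = 0x570    # bit 0: TraceEn (Intel PT)
DEBUGCTL_LBR = 1 << 0
RTIT_CTL_TRACEEN = 1 << 0


class DevMsr:
    """Hypothetical helper: MSR access through /dev/cpu/N/msr."""

    def __init__(self, cpu=0):
        self.fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDWR)

    def read(self, msr):
        # The file offset selects the MSR; each MSR is 8 bytes, little-endian.
        return struct.unpack("<Q", os.pread(self.fd, 8, msr))[0]

    def write(self, msr, value):
        os.pwrite(self.fd, struct.pack("<Q", value), msr)


def probe_lbr_tos(dev):
    """Check whether LBR_TOS retains a written value while DEBUGCTL.LBR is 0."""
    debugctl = dev.read(MSR_IA32_DEBUGCTL)
    dev.write(MSR_IA32_DEBUGCTL, debugctl & ~DEBUGCTL_LBR)
    try:
        trace_en = bool(dev.read(MSR_IA32_RTIT_CTL) & RTIT_CTL_TRACEEN)
        before = dev.read(MSR_LASTBRANCH_TOS)
        test_value = before ^ 1          # always differs from the current value
        dev.write(MSR_LASTBRANCH_TOS, test_value)
        after = dev.read(MSR_LASTBRANCH_TOS)
        dev.write(MSR_LASTBRANCH_TOS, before)   # best-effort restore
    finally:
        dev.write(MSR_IA32_DEBUGCTL, debugctl)  # put DEBUGCTL back as found
    return {
        "trace_en": trace_en,
        "before": before,
        "after": after,
        "writable": after == test_value,
    }
```

Calling probe_lbr_tos(DevMsr(0)) as root would return a small dict; on a
machine behaving as described above one would expect "writable" to be False
with "trace_en" also False, but that is exactly the point that still needs
confirming.
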
Best regards,
Maxim Levitsky
>
>
>
> > What do you think?
> >
> > Best regards,
> > Maxim Levitsky
> >
> > > I'll check this on another Skylake based machine and see if I see the same thing.
> > >
> > > Best regards,
> > > Maxim Levitsky
> > >
> >
> >
>
>