Message-ID: <Zx-z5sRKCXAXysqv@google.com>
Date: Mon, 28 Oct 2024 08:55:18 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Maxim Levitsky <mlevitsk@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: vmx_pmu_caps_test fails on Skylake based CPUS due to read only LBRs

On Fri, Oct 18, 2024, Maxim Levitsky wrote:
> Hi,
> 
> Our CI found another issue, this time with vmx_pmu_caps_test.
> 
> On 'Intel(R) Xeon(R) Gold 6328HL CPU' I see that all LBR msrs (from/to and
> TOS), are always read only - even when LBR is disabled - once I disable the
> feature in DEBUG_CTL, all LBR msrs reset to 0, and you can't change their
> value manually.  Freeze LBRS on PMI seems not to affect this behavior.
> 
> I don't know if this is how the hardware is supposed to work (Intel's manual
> doesn't mention anything about this), or if it is something platform
> specific, because this system also was found to have LBRs enabled
> (IA32_DEBUGCTL.LBR == 1) after a fresh boot, as if BIOS left them enabled - I
> don't have an idea on why.
> 
> The problem is that vmx_pmu_caps_test writes 0 to LBR_TOS via KVM_SET_MSRS,
> and KVM actually passes this write to the actual hardware MSR (this is somewhat
> weird),

When the "virtual" LBR event is active in host perf, the LBR MSRs are passed
through to the guest, and so KVM needs to propagate the guest values into hardware.

> and since the MSR is not writable and silently drops writes instead,
> once the test tries to read it, it gets some random value instead.

This just showed up in our testing too (delayed backport on our end).  I haven't
(yet) tried debugging our setup, but is there any chance Intel PT is interfering?

  33.3.1.2 Model Specific Capability Restrictions
  Some processor generations impose restrictions that prevent use of
  LBRs/BTS/BTM/LERs when software has enabled tracing with Intel Processor Trace.
  On these processors, when TraceEn is set, updates of LBR, BTS, BTM, LERs are
  suspended but the states of the corresponding IA32_DEBUGCTL control fields
  remained unchanged as if it were still enabled. When TraceEn is cleared, the
  LBR array is reset, and LBR/BTS/BTM/LERs updates will resume.
  Further, reads of these registers will return 0, and writes will be dropped.

  The list of MSRs whose updates/accesses are restricted follows.
  
    • MSR_LASTBRANCH_x_TO_IP, MSR_LASTBRANCH_x_FROM_IP, MSR_LBR_INFO_x, MSR_LASTBRANCH_TOS
    • MSR_LER_FROM_LIP, MSR_LER_TO_LIP
    • MSR_LBR_SELECT
  
  For processors with CPUID DisplayFamily_DisplayModel signatures of 06_3DH,
  06_47H, 06_4EH, 06_4FH, 06_56H, and 06_5EH, the use of Intel PT and LBRs are
  mutually exclusive.
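
To make the model list above concrete, here is a minimal sketch of decoding DisplayFamily_DisplayModel from a raw CPUID.01H:EAX value and checking it against the six signatures quoted from the SDM. The function names and the sample EAX values are illustrative assumptions, not anything taken from KVM or the selftest.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode DisplayFamily from CPUID.01H:EAX (standard Intel encoding). */
unsigned int display_family(uint32_t eax)
{
    unsigned int family = (eax >> 8) & 0xf;
    unsigned int ext_family = (eax >> 20) & 0xff;

    return family == 0xf ? family + ext_family : family;
}

/* Decode DisplayModel from CPUID.01H:EAX (standard Intel encoding). */
unsigned int display_model(uint32_t eax)
{
    unsigned int family = (eax >> 8) & 0xf;
    unsigned int model = (eax >> 4) & 0xf;
    unsigned int ext_model = (eax >> 16) & 0xf;

    if (family == 0x6 || family == 0xf)
        return (ext_model << 4) | model;
    return model;
}

/*
 * Hypothetical helper: returns 1 if the signature is one of the 06_xxH
 * models the quoted SDM text lists as having mutually exclusive
 * Intel PT and LBR use, 0 otherwise.
 */
int pt_lbr_mutually_exclusive(uint32_t eax)
{
    static const unsigned int models[] = {
        0x3d, 0x47, 0x4e, 0x4f, 0x56, 0x5e,
    };
    size_t i;

    if (display_family(eax) != 0x6)
        return 0;

    for (i = 0; i < sizeof(models) / sizeof(models[0]); i++) {
        if (display_model(eax) == models[i])
            return 1;
    }
    return 0;
}
```

For example, a sample EAX of 0x000506E3 decodes to 06_5EH and matches the list, while 0x00050654 decodes to 06_55H and does not. A CPU outside the list can still be subject to the more general "some processor generations" restriction in the first paragraph of the excerpt, so a negative result here does not rule out Intel PT.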

If Intel PT is NOT responsible, i.e. the behavior really is due to DEBUG_CTL.LBR=0,
then I don't see how KVM can sanely virtualize LBRs.
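
One way to tell the two cases apart on an affected host is to read IA32_RTIT_CTL and IA32_DEBUGCTL directly. The following is a rough sketch, assuming the Linux msr driver is loaded and the caller has permission to open /dev/cpu/N/msr; the helper names are hypothetical, and the values read reflect host state on that CPU at the moment of the read, not what KVM loads while a vCPU is running.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_IA32_DEBUGCTL   0x1d9
#define MSR_IA32_RTIT_CTL   0x570
#define DEBUGCTL_LBR        (1ULL << 0)  /* IA32_DEBUGCTL.LBR */
#define RTIT_CTL_TRACEEN    (1ULL << 0)  /* IA32_RTIT_CTL.TraceEn */

/*
 * Read one MSR through the msr driver: the file offset is the MSR index
 * and each read returns 8 bytes. Returns 0 on success, -1 on any failure
 * (driver not loaded, no permission, MSR not implemented on this CPU).
 */
int read_msr(int cpu, uint32_t msr, uint64_t *val)
{
    char path[64];
    ssize_t n;
    int fd;

    snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    n = pread(fd, val, sizeof(*val), msr);
    close(fd);
    return n == (ssize_t)sizeof(*val) ? 0 : -1;
}

/* 1 if Intel PT tracing is enabled according to IA32_RTIT_CTL. */
int rtit_trace_enabled(uint64_t rtit_ctl)
{
    return (rtit_ctl & RTIT_CTL_TRACEEN) != 0;
}

/* 1 if LBR recording is enabled according to IA32_DEBUGCTL. */
int debugctl_lbr_enabled(uint64_t debugctl)
{
    return (debugctl & DEBUGCTL_LBR) != 0;
}
```

If TraceEn reads as set while the LBR MSRs return 0 and drop writes, that would be consistent with the SDM restriction quoted earlier; if TraceEn is clear and the MSRs are still read-only, Intel PT is presumably not the cause.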
