Message-ID: <20240913173323.6guq4p2h4z7ulgr3@desk>
Date: Fri, 13 Sep 2024 10:33:23 -0700
From: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
To: Jon Kohler <jon@...anix.com>
Cc: Chao Gao <chao.gao@...el.com>, Thomas Gleixner <tglx@...utronix.de>,
	Borislav Petkov <bp@...en8.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Ingo Molnar <mingo@...hat.com>,
	Dave Hansen <dave.hansen@...ux.intel.com>, X86 ML <x86@...nel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	LKML <linux-kernel@...r.kernel.org>,
	"kvm @ vger . kernel . org" <kvm@...r.kernel.org>
Subject: Re: [PATCH] x86/bhi: avoid hardware mitigation for
 'spectre_bhi=vmexit'

On Fri, Sep 13, 2024 at 03:51:01PM +0000, Jon Kohler wrote:
> 
> 
> > On Sep 13, 2024, at 1:28 AM, Chao Gao <chao.gao@...el.com> wrote:
> > 
> > On Thu, Sep 12, 2024 at 09:24:40AM -0700, Pawan Gupta wrote:
> >> On Thu, Sep 12, 2024 at 03:44:38PM +0000, Jon Kohler wrote:
> >>>> It is only worth implementing the long sequence in VMEXIT_ONLY mode if it is
> >>>> significantly better than toggling the MSR.
> >>> 
> >>> Thanks for the pointer! I hadn’t seen that second sequence. I’ll do measurements on
> >>> three cases and come back with data from an SPR system.
> >>> 1. as-is (wrmsr on entry and exit)
> >>> 2. Short sequence (as a baseline)
> >>> 3. Long sequence
> >> 
> 
> Pawan,
> 
> Thanks for the pointer to the long sequence. I've tested it along with 
> Listing 3 (TSX Abort sequence) using KUT tscdeadline_immed test. TSX 
> abort sequence performs better unless BHI mitigation is off or 
> host/guest spec_ctrl values match, avoiding WRMSR toggling. Having the
> values match the DIS_S value is easier said than done across a fleet
> that is already using eIBRS heavily.
> 
> Test System:
> - Intel Xeon Gold 6442Y, microcode 0x2b0005c0
> - Linux 6.6.34 + patches, qemu 8.2
> - KVM Unit Tests @ latest (17f6f2fd) with tscdeadline_immed + edits:
> - Toggle spec ctrl before test in main()
> - Use cpu type SapphireRapids-v2
> 
> Test string:
> TESTNAME=vmexit_tscdeadline_immed TIMEOUT=90s MACHINE= ACCEL= taskset -c 26 ./x86/run x86/vmexit.flat \
> -smp 1 -cpu SapphireRapids-v2,+x2apic,+tsc-deadline -append tscdeadline_immed |grep tscdeadline
> 
> Test Results:
> 1. spectre_bhi=on, host spec_ctrl=1025, guest spec_ctrl=1: tscdeadline_immed 3878 (WRMSR toggling)
> 2. spectre_bhi=on, host spec_ctrl=1025, guest spec_ctrl=1025: tscdeadline_immed 3153 (no WRMSR toggling)
> 3. spectre_bhi=vmexit, BHB long sequence, host/guest spec_ctrl=1: tscdeadline_immed 3629 (still better than test 1, penalty only on exit)
> 4. spectre_bhi=vmexit, TSX abort sequence, host/guest spec_ctrl=1: tscdeadline_immed 3294 (best general purpose performance)

This looks promising.

> 5. spectre_bhi=vmexit, TSX abort sequence, host spec_ctrl=1, guest spec_ctrl=1025: tscdeadline_immed 4011 (needs optimization)

Once QEMU adds support for exposing BHI_CTRL, this is a very likely
scenario. To optimize this, the host needs to have BHI_DIS_S set. We also need
to account for the case where some guests set BHI_DIS_S and others don't.
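
For reference, the spec_ctrl values quoted in the results decode as
IA32_SPEC_CTRL bit fields: 1 is IBRS alone, and 1025 (0x401) is IBRS plus
BHI_DIS_S (bit 10). The following is only a rough sketch in Python, not
kernel code; the helper names decode_spec_ctrl() and needs_wrmsr_toggle()
are made up for illustration, and the toggle check simply assumes that a
WRMSR is needed whenever the host and guest values differ, as the test
notes describe.

```python
# IA32_SPEC_CTRL bit positions (same layout as in Linux msr-index.h).
SPEC_CTRL_FLAGS = [
    ("IBRS", 1 << 0),
    ("STIBP", 1 << 1),
    ("SSBD", 1 << 2),
    ("BHI_DIS_S", 1 << 10),
]


def decode_spec_ctrl(value):
    """Return the names of the known bits set in a spec_ctrl value.

    Hypothetical helper for illustration only.
    """
    return [name for name, bit in SPEC_CTRL_FLAGS if value & bit]


def needs_wrmsr_toggle(host, guest):
    """Assumption: the MSR is rewritten on entry/exit only when the
    host and guest values differ."""
    return host != guest


if __name__ == "__main__":
    # (host, guest) pairs taken from tests 1, 2 and 5 in the results.
    for host, guest in [(1025, 1), (1025, 1025), (1, 1025)]:
        print(
            f"host={decode_spec_ctrl(host)} guest={decode_spec_ctrl(guest)} "
            f"toggle={needs_wrmsr_toggle(host, guest)}"
        )
```

With mixed guests, any guest whose value differs from the host's falls into
the toggling case, whichever value the host picks.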

> In short, there is a significant speedup to be had here.
> 
> As for test 5, honestly that is somewhat invalid because it would be
> dependent on the VMM user space showing BHI_CTRL.

Right.

> QEMU as an example does not do that, so even with latest qemu and latest
> kernel, guests will still use BHB loop even on SPR++ today, and they
> could use the TSX loop with this proposed change if the VMM exposes RTM
> feature.

I did not know that QEMU does not expose CPUID.BHI_CTRL. Chao, could you
please help get this feature exposed in QEMU?

> I'm happy to post a V2 patch with my TSX changes, or take any other
> suggestions here.

With CPUID.BHI_CTRL exposed to guests, this:

> 2. spectre_bhi=on, host spec_ctrl=1025, guest spec_ctrl=1025: tscdeadline_immed 3153 (no WRMSR toggling)

will be the most common case, which is also the best performing. Isn't it
better to aim for this?
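
To put the five quoted numbers side by side, here is a small Python sketch
that expresses each tscdeadline_immed result as overhead relative to test 2.
The figures are copied from the results above (lower is better); the
dictionary labels and the overhead_pct() helper are made up for
illustration.

```python
# tscdeadline_immed results from the quoted test run (lower is better).
RESULTS = {
    "1: bhi=on, host 1025 / guest 1 (WRMSR toggling)": 3878,
    "2: bhi=on, host 1025 / guest 1025 (no toggling)": 3153,
    "3: bhi=vmexit, BHB long sequence, host/guest 1": 3629,
    "4: bhi=vmexit, TSX abort sequence, host/guest 1": 3294,
    "5: bhi=vmexit, TSX abort, host 1 / guest 1025": 4011,
}

# Test 2 is used as the reference point.
BASELINE = RESULTS["2: bhi=on, host 1025 / guest 1025 (no toggling)"]


def overhead_pct(value, baseline=BASELINE):
    """Percentage by which `value` exceeds `baseline` (hypothetical helper)."""
    return (value - baseline) / baseline * 100.0


if __name__ == "__main__":
    for label, value in RESULTS.items():
        print(f"{label}: {value} ({overhead_pct(value):+.1f}% vs test 2)")
```

On these numbers the TSX abort sequence (test 4) lands within about 5% of
test 2, while the mismatched cases (tests 1 and 5) are more than 20% slower.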
