Message-ID: <db6a57eb67620d1b41d702baf16142669cc26e5c.camel@amd.com>
Date: Mon, 24 Nov 2025 16:15:47 +0000
From: "Shah, Amit" <Amit.Shah@....com>
To: "seanjc@...gle.com" <seanjc@...gle.com>
CC: "corbet@....net" <corbet@....net>, "pawan.kumar.gupta@...ux.intel.com"
<pawan.kumar.gupta@...ux.intel.com>, "kai.huang@...el.com"
<kai.huang@...el.com>, "jpoimboe@...nel.org" <jpoimboe@...nel.org>,
"andrew.cooper3@...rix.com" <andrew.cooper3@...rix.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"daniel.sneddon@...ux.intel.com" <daniel.sneddon@...ux.intel.com>, "Lendacky,
Thomas" <Thomas.Lendacky@....com>, "kvm@...r.kernel.org"
<kvm@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "mingo@...hat.com" <mingo@...hat.com>,
"dwmw@...zon.co.uk" <dwmw@...zon.co.uk>, "pbonzini@...hat.com"
<pbonzini@...hat.com>, "tglx@...utronix.de" <tglx@...utronix.de>, "Moger,
Babu" <Babu.Moger@....com>, "Das1, Sandipan" <Sandipan.Das@....com>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>, "hpa@...or.com"
<hpa@...or.com>, "peterz@...radead.org" <peterz@...radead.org>,
"bp@...en8.de" <bp@...en8.de>, "boris.ostrovsky@...cle.com"
<boris.ostrovsky@...cle.com>, "Kaplan, David" <David.Kaplan@....com>,
"x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH v6 1/1] x86: kvm: svm: set up ERAPS support for guests
On Thu, 2025-11-20 at 12:11 -0800, Sean Christopherson wrote:
>
> > 2. Hosts that disable NPT: the ERAPS feature flushes the RSB entries on
> >    several conditions, including CR3 updates. Emulating hardware
> >    behaviour on RSB flushes is not worth the effort for NPT=off case,
> >    nor is it worthwhile to enumerate and emulate every trigger the
> >    hardware uses to flush RSB entries. Instead of identifying and
> >    replicating RSB flushes that hardware would have performed had NPT
> >    been ON, do not let NPT=off VMs use the ERAPS features.
>
> The emulation requirements are not limited to shadow paging. From the APM:
>
>   The ERAPS feature eliminates the need to execute CALL instructions to clear
>   the return address predictor in most cases. On processors that support ERAPS,
>   return addresses from CALL instructions executed in host mode are not used in
>   guest mode, and vice versa. Additionally, the return address predictor is
>   cleared in all cases when the TLB is implicitly invalidated (see Section 5.5.3 “TLB
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   Management,” on page 159) and in the following cases:
>
> • MOV CR3 instruction
> • INVPCID other than single address invalidation (operation type 0)
>
> Yes, KVM only intercepts MOV CR3 and INVPCID when NPT is disabled (or INVPCID is
> unsupported per guest CPUID), but that is an implementation detail, the instructions
> are still reachable via emulator, and KVM needs to emulate implicit TLB flush
> behavior.
>
> So punting on emulating RAP clearing because it's too hard is not an option. And
> AFAICT, it's not even that hard.

I didn't mean punting on it in the "it's too hard" sense, but in the
sense that we don't know all the details of when hardware decides to do
a flush. Even if the triggers are listed in the APM today, future
changes to microcode or the APM docs might reveal more triggers that we
need to emulate and account for. There's no way to track such changes,
so my thinking is that we should be conservative and not assume
anything.

> The changelog also needs to include the architectural behavior, otherwise "is not
> worth the effort" is even more subjective since there's no documentation of what
> the effort would actually be.
>
> As for emulating the RAP clears, a clever idea is to piggyback and alias dirty
> tracking for VCPU_EXREG_CR3, as VCPU_EXREG_ERAPS. I.e. mark the vCPU as needing
> a RAP clear if CR3 is written to, and then let common x86 also set VCPU_EXREG_ERAPS
> as needed, e.g. when handling INVPCID.
>
> Compile tested only at this point, but this?

I'll run this on my hardware and check for anything obvious.

Since you're also saying the npt=on and npt=off cases aren't very
different, I'll check with the hardware architects to confirm we can
indeed go with the RAP clearing triggers as presented.

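
For anyone following along, the aliasing idea quoted above might look
roughly like the standalone sketch below. This is not the patch under
discussion and not KVM code: every sketch_* name, the struct layout and
the rap_clears counter are hypothetical, made up purely for
illustration. The only things taken from the thread are that a CR3
write marks the vCPU as needing a RAP clear, and that INVPCID types
other than single-address (type 0) do the same.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register indices; they only mirror the names quoted above. */
enum sketch_reg {
	SKETCH_EXREG_CR3 = 0,
	/* Alias: a dirty CR3 doubles as "RAP clear pending". */
	SKETCH_EXREG_ERAPS = SKETCH_EXREG_CR3,
};

/* Hypothetical, heavily reduced vCPU state. */
struct sketch_vcpu {
	uint64_t cr3;
	uint32_t regs_dirty;     /* one bit per tracked register */
	unsigned int rap_clears; /* counts simulated RAP clears */
};

static inline void sketch_mark_dirty(struct sketch_vcpu *v, enum sketch_reg r)
{
	v->regs_dirty |= 1u << r;
}

static inline bool sketch_is_dirty(const struct sketch_vcpu *v, enum sketch_reg r)
{
	return v->regs_dirty & (1u << r);
}

/* Emulated MOV CR3: the write itself flags the RAP as needing a clear. */
static inline void sketch_write_cr3(struct sketch_vcpu *v, uint64_t val)
{
	v->cr3 = val;
	sketch_mark_dirty(v, SKETCH_EXREG_CR3);
}

/* Emulated INVPCID: every type except single-address (0) requests a clear. */
static inline void sketch_emulate_invpcid(struct sketch_vcpu *v, unsigned int type)
{
	assert(type <= 3); /* INVPCID defines types 0..3 */
	if (type != 0)
		sketch_mark_dirty(v, SKETCH_EXREG_ERAPS);
}

/*
 * Before "guest entry": consume a pending request. Clearing the shared
 * bit here is a simplification of this sketch only.
 */
static inline void sketch_pre_run(struct sketch_vcpu *v)
{
	if (sketch_is_dirty(v, SKETCH_EXREG_ERAPS)) {
		v->rap_clears++; /* stand-in for asking hardware to clear the RAP */
		v->regs_dirty &= ~(1u << SKETCH_EXREG_ERAPS);
	}
}
```

Because the two indices share one bit, any path that already marks CR3
dirty requests a RAP clear for free, and only the extra triggers (such
as INVPCID) need an explicit call. Whether that set of triggers is
complete is exactly the open question for the architects.
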
Amit