Message-ID: <448e25a3-ff1f-4038-933c-66417cd6b7b4@citrix.com>
Date: Fri, 21 Nov 2025 15:21:21 +0000
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Amit Shah <amit@...nel.org>, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, x86@...nel.org, linux-doc@...r.kernel.org,
amit.shah@....com, thomas.lendacky@....com, bp@...en8.de,
tglx@...utronix.de, peterz@...radead.org, jpoimboe@...nel.org,
pawan.kumar.gupta@...ux.intel.com, corbet@....net, mingo@...hat.com,
dave.hansen@...ux.intel.com, hpa@...or.com, pbonzini@...hat.com,
daniel.sneddon@...ux.intel.com, kai.huang@...el.com, sandipan.das@....com,
boris.ostrovsky@...cle.com, Babu.Moger@....com, david.kaplan@....com,
dwmw@...zon.co.uk
Subject: Re: [PATCH v6 1/1] x86: kvm: svm: set up ERAPS support for guests

On 21/11/2025 2:58 pm, Sean Christopherson wrote:
> On Fri, Nov 21, 2025, Andrew Cooper wrote:
>> On 20/11/2025 8:11 pm, Sean Christopherson wrote:
>>> The emulation requirements are not limited to shadow paging. From the APM:
>>>
>>> The ERAPS feature eliminates the need to execute CALL instructions to clear
>>> the return address predictor in most cases. On processors that support ERAPS,
>>> return addresses from CALL instructions executed in host mode are not used in
>>> guest mode, and vice versa. Additionally, the return address predictor is
>>> cleared in all cases when the TLB is implicitly invalidated (see Section 5.5.3 “TLB
>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> Management,” on page 159) and in the following cases:
>>>
>>> • MOV CR3 instruction
>>> • INVPCID other than single address invalidation (operation type 0)
>> I already asked AMD for clarification here. AIUI, INVLPGB should be
>> included in this list, which raises the question of what else is missing
>> from the documentation.
>>
>>> Yes, KVM only intercepts MOV CR3 and INVPCID when NPT is disabled (or INVPCID is
>>> unsupported per guest CPUID), but that is an implementation detail, the instructions
>>> are still reachable via emulator, and KVM needs to emulate implicit TLB flush
>>> behavior.
>> The implicit flushes cover CR0.PG, CR4.{PSE,PGE,PCIDE,PKE}, SMI, RSM,
>> writes to MTRR MSRs, #INIT, A20M, and "other model specific MSRs, see NDA
>> docs".
>>
>> The final part is very unhelpful in practice, and necessitates a RAS
>> flush on any emulated WRMSR, unless AMD are going to start handing out
>> the multi-coloured documents...
> Does Xen actually emulate guest TLB flushes on all emulated WRMSRs?

Not currently. I need to reassess in light of this conversation.

> A RAS flush seems like small peanuts compared to a TLB flush.

Workload dependent, but in the common case, I'd expect so.

>
>> The really fastpath MSRs are unintercepted and won't suffer this overhead.
> Heh, if an unintercepted MSR is on the "naughty list", wouldn't that break shadow
> paging schemes that rely on intercepting architectural TLB flushes to synchronize
> shadow PTEs?
Hmm. Yes, it would, if (and only if) the OS is aware of, and depends on,
the WRMSR for TLB flushing.
I doubt OSes are depending on model specific side effects such as this,
but we have no way to know for sure.
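
For illustration only, the conservative policy discussed above could be
sketched roughly as below. This is not KVM or Xen code: the enum, its
members and emul_needs_ras_flush() are hypothetical names made up for the
example. Treating INVLPGB as a flushing case is an assumption pending AMD's
clarification, and flushing on every emulated WRMSR is only the fallback
for not knowing which model specific MSRs are affected.

```c
#include <stdbool.h>

/*
 * Hypothetical classification of emulated events. None of these names
 * exist in KVM or Xen; they only mirror the cases listed in this thread.
 */
enum emul_event {
    EV_MOV_CR3,
    EV_INVPCID_SINGLE_ADDR,  /* INVPCID operation type 0 */
    EV_INVPCID_OTHER,        /* any other INVPCID operation type */
    EV_INVLPGB,              /* assumption: not yet confirmed by AMD */
    EV_WRITE_CR0_PG,
    EV_WRITE_CR4_TLB_BITS,   /* CR4.{PSE,PGE,PCIDE,PKE} */
    EV_SMI,
    EV_RSM,
    EV_INIT,
    EV_A20M,
    EV_WRMSR,                /* any emulated WRMSR, MTRRs included */
    EV_OTHER,                /* anything with no implicit TLB flush */
};

/*
 * Return true if emulating the event should also clear the guest's
 * return address predictor, mirroring what hardware is documented
 * (or assumed, see above) to do.
 */
static bool emul_needs_ras_flush(enum emul_event ev)
{
    switch (ev) {
    case EV_MOV_CR3:
    case EV_INVPCID_OTHER:
    case EV_INVLPGB:
    case EV_WRITE_CR0_PG:
    case EV_WRITE_CR4_TLB_BITS:
    case EV_SMI:
    case EV_RSM:
    case EV_INIT:
    case EV_A20M:
    case EV_WRMSR:   /* conservative: the affected MSR list is not public */
        return true;
    case EV_INVPCID_SINGLE_ADDR:
    case EV_OTHER:
    default:
        return false;
    }
}
```

If the list of affected model specific MSRs were known, the EV_WRMSR case
could be narrowed to just those MSRs instead of every emulated write.
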
~Andrew