Open Source and information security mailing list archives
 
Message-ID: <f0768546-a767-4d74-956e-b40128272a09@zytor.com>
Date: Fri, 16 Jan 2026 16:43:00 -0800
From: "H. Peter Anvin" <hpa@...or.com>
To: Dave Hansen <dave.hansen@...el.com>, "Xin Li (Intel)" <xin@...or.com>,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-doc@...r.kernel.org
Cc: pbonzini@...hat.com, seanjc@...gle.com, corbet@....net, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        x86@...nel.org, luto@...nel.org, peterz@...radead.org,
        andrew.cooper3@...rix.com, chao.gao@...el.com, hch@...radead.org,
        sohil.mehta@...el.com
Subject: Re: [PATCH v9 08/22] KVM: VMX: Set FRED MSR intercepts

On 2026-01-16 11:49, Dave Hansen wrote:
> On 10/26/25 13:18, Xin Li (Intel) wrote:
>> Both MSR_IA32_FRED_RSP0 and MSR_IA32_FRED_SSP0 (aka MSR_IA32_PL0_SSP)
>> are dedicated for userspace event delivery, IOW they are NOT used in
>> any kernel event delivery and the execution of ERETS.  Thus KVM can
>> run safely with guest values in the two MSRs.  As a result, save and
>> restore of their guest values are deferred until vCPU context switch.
>> Host MSR_IA32_FRED_RSP0 is restored upon returning to userspace, and
>> Host MSR_IA32_PL0_SSP is managed with XRSTORS/XSAVES.
> 
> Is it worth making MSR_IA32_FRED_RSP0 special versus MSR_IA32_FRED_RSP[123]?
> 
> Is it needed because MSR_IA32_FRED_RSP0 is rewritten all the time as
> CPUs switch between threads? But MSR_IA32_FRED_RSP[123] are not
> frequently written?
> 
> I'd like to hear more about the motivation.

Because RSP[123] (and SSP[123]) are used by the kernel itself, they are
context-switched automatically by VT-x. This is necessary in order to preserve
the FRED architectural invariant that there should NEVER be a "gap" during
which it is unsafe to take an exception.

[RS]SP0 are not used while in kernel mode (since the only time we switch
*onto* the level 0 kernel stack is when entering from user space, logically
"CSL -1"), so during FRED architectural discussions it was agreed that it was
better to leave their management to kernel code, especially as KVM often does
not need to cross back into user space after each VMEXIT.

A nice side effect, but just that -- a side effect -- is that we don't need to
actually modify [RS]SP0 in the context-switch or task setup code.

The invariant that needs to be maintained is that IF the cached rsp0 value is
equal to the initial stack pointer for the running task, THEN the MSR MUST
match the cached value. A corollary of that is that if we modify either the
MSR or the cached value from an event that may have interrupted the kernel we
MUST make sure that this invariant cannot be inadvertently broken.

Setting the cached value to an invalid value (e.g. NULL/0) should work; there
shouldn't be an actual need to read the MSR unless I'm mistaken -- but I have
been working on other code today and so my cache for this specific code is not
100% up to date.
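
To make the invariant concrete, here is a minimal user-space sketch in C. It
is not the actual kernel code: the MSR is simulated by a plain variable, and
every name in it (sim_msr_rsp0, cached_rsp0, update_rsp0(), invalidate_rsp0(),
invariant_holds()) is hypothetical. It also assumes that 0 is never a valid
stack top, so that 0 can serve as the "invalid" cached value.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hardware MSR; real code would use a WRMSR variant. */
static uint64_t sim_msr_rsp0;
static unsigned int sim_msr_writes;

/* Hypothetical per-CPU cache of the value last written to the MSR. */
static uint64_t cached_rsp0;

static void sim_wrmsr_rsp0(uint64_t val)
{
	sim_msr_rsp0 = val;
	sim_msr_writes++;
}

/*
 * Called before returning to user space. The MSR is written before the
 * cache is updated, so there is no point at which the cache matches the
 * task's stack top while the MSR does not.
 */
static void update_rsp0(uint64_t task_stack_top)
{
	assert(task_stack_top != 0);	/* 0 is reserved as "invalid" */

	if (cached_rsp0 != task_stack_top) {
		sim_wrmsr_rsp0(task_stack_top);
		cached_rsp0 = task_stack_top;
	}
}

/*
 * Called when the MSR may hold a foreign value (e.g. a guest's).
 * Only the cache is touched; the MSR itself is never read.
 */
static void invalidate_rsp0(void)
{
	cached_rsp0 = 0;
}

/* IF cache == task stack top THEN MSR == cache. */
static int invariant_holds(uint64_t task_stack_top)
{
	return cached_rsp0 != task_stack_top ||
	       sim_msr_rsp0 == task_stack_top;
}
```

In this sketch, invalidating the cache makes the invariant hold vacuously, and
the next update_rsp0() call rewrites the MSR unconditionally, which is why no
MSR read should be needed.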

	-hpa

