lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5jwnuhsfer2eovcran7zfyjh7jjrc4zdjgimuipympjnznq7gr@fxdpszsihgup>
Date: Thu, 22 Jan 2026 01:20:48 +0000
From: Yosry Ahmed <yosry.ahmed@...ux.dev>
To: Jim Mattson <jmattson@...gle.com>
Cc: Sean Christopherson <seanjc@...gle.com>, 
	Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner <tglx@...utronix.de>, 
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>, 
	Shuah Khan <shuah@...nel.org>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v2 1/8] KVM: x86: nSVM: Redirect IA32_PAT accesses to
 either hPAT or gPAT

On Thu, Jan 15, 2026 at 03:21:40PM -0800, Jim Mattson wrote:
> When the vCPU is in guest mode with nested NPT enabled, guest accesses to
> IA32_PAT are redirected to the gPAT register, which is stored in
> vmcb02->save.g_pat.
> 
> Non-guest accesses (e.g. from userspace) to IA32_PAT are always redirected
> to hPAT, which is stored in vcpu->arch.pat.
> 
> This is architected behavior. It also makes it possible to restore a new
> checkpoint on an old kernel with reasonable semantics. After the restore,
> gPAT will be lost, and L2 will run on L1's PAT. Note that the old kernel
> would have always run L2 on L1's PAT.

This creates a difference in MSR_IA32_CR_PAT handling with nested SVM
and nested VMX, right?

AFAICT, reading MSR_IA32_CR_PAT while an L2 VMX guest is running will
read L2's PAT. With this change, the same scenario on SVM will read L1's
PAT. We can claim that it was always L1's PAT though, because we've
always been running L2 with L1's PAT.

I am just raising this in case it's a problem to have different behavior
for SVM and VMX. I understand that we need to do this to be able to
save/restore L1's PAT with SVM in guest mode and maintain backward
compatibility.

IIUC VMX does not have the same issue because host and guest PAT are
both in vmcs12 and are both saved/restored appropriately.

> 
> Signed-off-by: Jim Mattson <jmattson@...gle.com>
> ---
>  arch/x86/kvm/svm/svm.c | 31 ++++++++++++++++++++++++-------
>  1 file changed, 24 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 7041498a8091..3f8581adf0c1 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2846,6 +2846,13 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_AMD64_DE_CFG:
>  		msr_info->data = svm->msr_decfg;
>  		break;
> +	case MSR_IA32_CR_PAT:
> +		if (!msr_info->host_initiated && is_guest_mode(vcpu) &&
> +		    nested_npt_enabled(svm))
> +			msr_info->data = svm->vmcb->save.g_pat; /* gPAT */
> +		else
> +			msr_info->data = vcpu->arch.pat; /* hPAT */
> +		break;
>  	default:
>  		return kvm_get_msr_common(vcpu, msr_info);
>  	}
> @@ -2929,14 +2936,24 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  
>  		break;
>  	case MSR_IA32_CR_PAT:
> -		ret = kvm_set_msr_common(vcpu, msr);
> -		if (ret)
> -			break;
> +		if (!kvm_pat_valid(data))
> +			return 1;
>  
> -		svm->vmcb01.ptr->save.g_pat = data;

This is a bug fix, L2 is now able to alter L1's PAT, right?

> -		if (is_guest_mode(vcpu))
> -			nested_vmcb02_compute_g_pat(svm);
> -		vmcb_mark_dirty(svm->vmcb, VMCB_NPT);

This looks like another bug fix, it seems like we'll update vmcb01 but
clear the clean bit in vmcb02 if we're in guest mode.

Probably worth calling these out (and CC:stable, Fixes:.., etc)?

We probably need a comment here explaining the gPAT vs hPAT case, I
don't think it's super obvious why we only redirect L2's own accesses to
its PAT but not userspace's.

> +		if (!msr->host_initiated && is_guest_mode(vcpu) &&
> +		    nested_npt_enabled(svm)) {
> +			svm->vmcb->save.g_pat = data; /* gPAT */
> +			vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
> +		} else {
> +			vcpu->arch.pat = data; /* hPAT */

Should we call kvm_set_msr_common() here instead of setting
vcpu->arch.pat? The kvm_pat_valid() call would be redundant but that
should be fine. My main concern is if kvm_set_msr_common() gains more
logic for MSR_IA32_CR_PAT that isn't reflected here. Probably unlikely
tho..

> +			if (npt_enabled) {
> +				svm->vmcb01.ptr->save.g_pat = data;
> +				vmcb_mark_dirty(svm->vmcb01.ptr, VMCB_NPT);
> +				if (is_guest_mode(vcpu)) {

IIUC (with the fix you mentioned) this is because L1 and L2 share the
PAT if nested NPT is disabled, and if we're already in guest mode then
we also need to update vmcb02 (as it was already created based on vmcb01
with the old PAT). Probably worth a comment.

> +					svm->vmcb->save.g_pat = data;
> +					vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
> +				}
> +			}
> +		}
>  		break;
>  	case MSR_IA32_SPEC_CTRL:
>  		if (!msr->host_initiated &&
> -- 
> 2.52.0.457.g6b5491de43-goog
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ