Message-ID: <ZgVDvCePGwKWv0wd@chao-email>
Date: Thu, 28 Mar 2024 18:17:32 +0800
From: Chao Gao <chao.gao@...el.com>
To: Xiaoyao Li <xiaoyao.li@...el.com>
CC: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>, "Yamahata, Isaku"
	<isaku.yamahata@...el.com>, "Zhang, Tina" <tina.zhang@...el.com>,
	"seanjc@...gle.com" <seanjc@...gle.com>, "Huang, Kai" <kai.huang@...el.com>,
	"Chen, Bo2" <chen.bo@...el.com>, "sagis@...gle.com" <sagis@...gle.com>,
	"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Aktas, Erdem"
	<erdemaktas@...gle.com>, "isaku.yamahata@...ux.intel.com"
	<isaku.yamahata@...ux.intel.com>, "pbonzini@...hat.com"
	<pbonzini@...hat.com>, "Yuan, Hang" <hang.yuan@...el.com>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: Re: [PATCH v19 059/130] KVM: x86/tdp_mmu: Don't zap private pages
 for unsupported cases

On Thu, Mar 28, 2024 at 11:40:27AM +0800, Xiaoyao Li wrote:
>On 3/28/2024 11:04 AM, Edgecombe, Rick P wrote:
>> On Thu, 2024-03-28 at 09:30 +0800, Xiaoyao Li wrote:
>> > > The current ABI of KVM_EXIT_X86_RDMSR when TDs are created is nothing. So I don't see how this is
>> > > any kind of ABI break. If you agree we shouldn't try to support MTRRs, do you have a different exit
>> > > reason or behavior in mind?
>> > 
>> > Just return error on TDVMCALL of RDMSR/WRMSR on TD's access of MTRR MSRs.
>> 
>> MTRR appears to be configured to be type "Fixed" in the TDX module. So the guest could expect to be
>> able to use it and be surprised by a #GP.
>> 
>>          {
>>            "MSB": "12",
>>            "LSB": "12",
>>            "Field Size": "1",
>>            "Field Name": "MTRR",
>>            "Configuration Details": null,
>>            "Bit or Field Virtualization Type": "Fixed",
>>            "Virtualization Details": "0x1"
>>          },
>> 
>> If KVM does not support MTRRs in TDX, then it has to return the error somewhere or pretend to
>> support it (do nothing but not return an error). Returning an error to the guest would be making up
>> arch behavior, and to a lesser degree so would ignoring the WRMSR.
>
>The root cause is that it's a bad design of TDX to make MTRR fixed1. When
>guest reads MTRR CPUID as 1 while getting #VE on MTRR MSRs, it already breaks
>the architectural behavior. (MCA faces a similar issue: MCA is fixed1 as

I wouldn't say that #VE on MTRR MSRs breaks anything. Writes to other MSRs (e.g.
the TSC_DEADLINE MSR) also lead to #VE. If KVM can emulate the MSR accesses, #VE
should be fine.

The problem is that the MTRR CPUID feature is fixed to 1 while KVM/QEMU doesn't
know how to virtualize MTRRs, especially given that KVM cannot control the
memory type in secure-EPT entries.
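
To make the "fixed 1" point concrete: the bit 12 quoted above is the MTRR
feature flag in CPUID.01H:EDX. A minimal sketch of how a guest would test it
(the helper and macro names are hypothetical, not taken from any kernel
source). In a TD this always evaluates to true, so the guest has every reason
to touch the MTRR MSRs:

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:EDX bit 12 is the architectural MTRR feature flag. */
#define CPUID_1_EDX_MTRR_BIT 12

/* Hypothetical helper: true if the given CPUID.01H:EDX value advertises MTRR. */
static bool cpuid1_edx_has_mtrr(uint32_t edx)
{
	return (edx >> CPUID_1_EDX_MTRR_BIT) & 1u;
}
```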

>well while accessing MCA related MSRs gets #VE. This is why TDX is going to
>fix them by introducing a new feature that makes them configurable)
>
>> So that is why I lean towards
>> returning to userspace and giving the VMM the option to ignore it, return an error to the guest or
>> show an error to the user.
>
>"show an error to the user" doesn't help at all. Because user cannot fix it,
>nor does QEMU.

The key point isn't who can fix/emulate MTRR MSRs. It is just that KVM doesn't
know how to handle this situation and asks userspace for help.

Whether or how userspace can handle the MSR writes isn't KVM's problem. It might
be better if KVM could tell userspace exactly in which cases KVM will exit to
userspace. But there is no such infrastructure.
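
For illustration only, the three userspace options mentioned earlier in the
thread (ignore the access, return an error to the guest, or show an error to
the user) could be expressed as a policy like the one below. Every name here
is hypothetical; this is not an existing KVM or QEMU interface, just a sketch
of the decision that would be left to the VMM:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical VMM policy for MSR accesses that KVM punts to userspace. */
enum td_msr_policy {
	TD_MSR_IGNORE,      /* pretend the access succeeded */
	TD_MSR_FAIL_GUEST,  /* return an error to the guest */
	TD_MSR_REPORT_USER, /* stop the guest and show an error to the user */
};

/* Hypothetical outcome of handling one such exit. */
struct td_msr_result {
	bool resume_guest;   /* re-enter the guest afterwards? */
	bool guest_error;    /* complete the access with an error? */
	uint64_t read_value; /* value handed back if the access was a read */
};

static struct td_msr_result td_handle_unsupported_msr(enum td_msr_policy policy)
{
	/* Default: resume the guest, no error, reads return 0. */
	struct td_msr_result r = {
		.resume_guest = true,
		.guest_error = false,
		.read_value = 0,
	};

	switch (policy) {
	case TD_MSR_IGNORE:
		break; /* writes are dropped, reads see 0 */
	case TD_MSR_FAIL_GUEST:
		r.guest_error = true;
		break;
	case TD_MSR_REPORT_USER:
		r.resume_guest = false;
		break;
	}
	return r;
}
```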

An example: in the KVM CET series, we found that it is complex for the KVM
instruction emulator to emulate control flow instructions when CET is enabled.
The suggestion there is also to punt to userspace (w/o any indication to
userspace that KVM would do this).

>
>> If KVM can't support the behavior, better to get an actual error in
>> userspace than a mysterious guest hang, right?
>What behavior do you mean?
>
>> Outside of what kind of exit it is, do you object to the general plan to punt to userspace?
>> 
>> Since this is a TDX specific limitation, I guess there is KVM_EXIT_TDX_VMCALL as a general category
>> of TDVMCALLs that cannot be handled by KVM.

Using KVM_EXIT_TDX_VMCALL looks fine.

We need to explain why MTRR MSRs are handled in this way unlike other MSRs.
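
For reference, the MSRs that would get this special treatment can be picked
out by index. A rough sketch using the architectural MSR indices from the SDM;
the helper name is made up and this is not KVM's code. It assumes 8
variable-range pairs, whereas the real count is reported by IA32_MTRRCAP.VCNT:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 8 variable-range PHYSBASE/PHYSMASK pairs (0x200..0x20F). */
#define NR_VAR_MTRR_ASSUMED 8

/* Hypothetical helper: true if the MSR index belongs to the MTRR family. */
static bool is_mtrr_msr(uint32_t msr)
{
	if (msr == 0xFE)   /* IA32_MTRRCAP */
		return true;
	if (msr == 0x2FF)  /* IA32_MTRR_DEF_TYPE */
		return true;
	if (msr >= 0x200 && msr < 0x200 + 2 * NR_VAR_MTRR_ASSUMED)
		return true;   /* IA32_MTRR_PHYSBASEn / IA32_MTRR_PHYSMASKn */
	if (msr == 0x250)  /* IA32_MTRR_FIX64K_00000 */
		return true;
	if (msr == 0x258 || msr == 0x259) /* IA32_MTRR_FIX16K_80000 / _A0000 */
		return true;
	if (msr >= 0x268 && msr <= 0x26F) /* IA32_MTRR_FIX4K_C0000 .. _F8000 */
		return true;
	return false;
}
```

Note that IA32_PAT (0x277) sits inside this index range but is not an MTRR
MSR, so it is deliberately not matched.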

It would be better if KVM could tell userspace that MTRR virtualization isn't
supported by KVM for TDs. Then userspace should resolve the conflict between
KVM and the TDX module on MTRR. But to report MTRR as unsupported, we need to
make GET_SUPPORTED_CPUID a vm-scope ioctl. I am not sure if it is worth the
effort.


>
>I just don't see any difference between handling it in KVM and handling it in
>userspace: either a) return error to guest or b) ignore the WRMSR.
