[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CO1PR11MB5107FBC68DBA6877E390A633912AA@CO1PR11MB5107.namprd11.prod.outlook.com>
Date: Fri, 30 Jun 2023 15:05:54 +0000
From: "Neiger, Gil" <gil.neiger@...el.com>
To: "Yang, Weijiang" <weijiang.yang@...el.com>,
"Gao, Chao" <chao.gao@...el.com>
CC: "Christopherson,, Sean" <seanjc@...gle.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"rppt@...nel.org" <rppt@...nel.org>,
"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>,
"Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"john.allen@....com" <john.allen@....com>
Subject: RE: [PATCH v3 10/21] KVM:x86: Add #CP support in guest exception
classification
Intel will not produce any CPU with CET that does not enumerate IA32_VMX_BASIC[56] as 1.
One can check that bit before injecting a #CP with error code, but it should not be necessary if CET is enumerated.
Of course, if the KVM may run as a guest of another VMM/hypervisor, it may be that the virtual CPU in which KVM operates may enumerate CET but clear the bit in IA32_VMX_BASIC.
- Gil
-----Original Message-----
From: Yang, Weijiang <weijiang.yang@...el.com>
Sent: Friday, June 30, 2023 05:05
To: Gao, Chao <chao.gao@...el.com>
Cc: Christopherson,, Sean <seanjc@...gle.com>; pbonzini@...hat.com; kvm@...r.kernel.org; linux-kernel@...r.kernel.org; peterz@...radead.org; rppt@...nel.org; binbin.wu@...ux.intel.com; Edgecombe, Rick P <rick.p.edgecombe@...el.com>; john.allen@....com; Neiger, Gil <gil.neiger@...el.com>
Subject: Re: [PATCH v3 10/21] KVM:x86: Add #CP support in guest exception classification
On 6/30/2023 6:27 PM, Chao Gao wrote:
> On Fri, Jun 30, 2023 at 05:34:28PM +0800, Yang, Weijiang wrote:
>> On 6/17/2023 2:57 AM, Sean Christopherson wrote:
>>> On Fri, Jun 16, 2023, Weijiang Yang wrote:
>>>> On 6/16/2023 7:58 AM, Sean Christopherson wrote:
>>>>> On Thu, Jun 08, 2023, Weijiang Yang wrote:
>>>>>> On 6/6/2023 5:08 PM, Chao Gao wrote:
>>>>>>> On Thu, May 11, 2023 at 12:08:46AM -0400, Yang Weijiang wrote:
>>>>>>>> Add handling for Control Protection (#CP) exceptions(vector 21).
>>>>>>>> The new vector is introduced for Intel's Control-Flow
>>>>>>>> Enforcement Technology (CET) relevant violation cases.
>>>>>>>>
>>>>>>>> Although #CP belongs contributory exception class, but the
>>>>>>>> actual effect is conditional on CET being exposed to guest. If
>>>>>>>> CET is not available to guest, #CP falls back to
>>>>>>>> non-contributory and doesn't have an error code.
>>>>>>> This sounds weird. is this the hardware behavior? If yes, could
>>>>>>> you point us to where this behavior is documented?
>>>>>> It's not SDM documented behavior.
>>>>> The #CP behavior needs to be documented. Please pester whoever
>>>>> you need to in order to make that happen.
>>>> Do you mean documentation for #CP as an generic exception or the
>>>> behavior in KVM as this patch shows?
>>> As I pointed out two *years* ago, this entry in the SDM
>>>
>>> — The field's deliver-error-code bit (bit 11) is 1 if each of the following
>>> holds: (1) the interruption type is hardware exception; (2) bit 0
>>> (corresponding to CR0.PE) is set in the CR0 field in the guest-state area;
>>> (3) IA32_VMX_BASIC[56] is read as 0 (see Appendix A.1); and (4) the vector
>>> indicates one of the following exceptions: #DF (vector 8), #TS (10),
>>> #NP (11), #SS (12), #GP (13), #PF (14), or #AC (17).
>>>
>>> needs to read something like
>>>
>>> — The field's deliver-error-code bit (bit 11) is 1 if each of the following
>>> holds: (1) the interruption type is hardware exception; (2) bit 0
>>> (corresponding to CR0.PE) is set in the CR0 field in the guest-state area;
>>> (3) IA32_VMX_BASIC[56] is read as 0 (see Appendix A.1); and (4) the vector
>>> indicates one of the following exceptions: #DF (vector 8), #TS (10),
>>> #NP (11), #SS (12), #GP (13), #PF (14), #AC (17), or #CP
>>> (21)[1]
>>>
>>> [1] #CP has an error code if and only if IA32_VMX_CR4_FIXED1 enumerates
>>> support for the 1-setting of CR4.CET.
>> Hi, Sean,
>>
>> I sent above change request to Gil(added in cc), but he shared
>> different opinion on this issue:
>>
>>
>> "It is the case that all CET-capable parts enumerate IA32_VMX_BASIC[56] as 1.
>>
>> However, there were earlier parts without CET that enumerated
>> IA32_VMX_BASIC[56] as 0.
>>
>> On those parts, an attempt to inject an exception with vector 21
>> (#CP) with an error code would fail.
>>
>> (Injection of exception 21 with no error code would be allowed.)
>>
>> It may make things clearer if we document the statement above (all
>> CET-capable parts enumerate IA32_VMX_BASIC[56] as 1).
>>
>> I will see if we can update future revisions of the SDM to clarify this."
>>
>>
>> Then if this is the case, kvm needs to check IA32_VMX_BASIC[56]
>> before inject exception to nested VM.
> And KVM can hide CET from guests if IA32_VMX_BASIC[56] is 0.
Yes, this scratch patch didn't cover cross-check with CET enabling, thanks!
>
>> And this patch could be removed, instead need another patch like below:
>>
>> diff --git a/arch/x86/include/asm/msr-index.h
>> b/arch/x86/include/asm/msr-index.h
>> index ad35355ee43e..6b33aacc8587 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -1076,6 +1076,7 @@
>> #define VMX_BASIC_MEM_TYPE_MASK 0x003c000000000000LLU
>> #define VMX_BASIC_MEM_TYPE_WB 6LLU
>> #define VMX_BASIC_INOUT 0x0040000000000000LLU
>> +#define VMX_BASIC_CHECK_ERRCODE 0x0140000000000000LLU
>>
>> /* Resctrl MSRs: */
>> /* - Intel: */
>> diff --git a/arch/x86/kvm/vmx/capabilities.h
>> b/arch/x86/kvm/vmx/capabilities.h index 85cffeae7f10..4b1ed4dc03bc
>> 100644
>> --- a/arch/x86/kvm/vmx/capabilities.h
>> +++ b/arch/x86/kvm/vmx/capabilities.h
>> @@ -79,6 +79,11 @@ static inline bool cpu_has_vmx_basic_inout(void)
>> return (((u64)vmcs_config.basic_cap << 32) &
>> VMX_BASIC_INOUT);
>> }
>>
>> +static inline bool cpu_has_vmx_basic_check_errcode(void)
>> +{
>> + return (((u64)vmcs_config.basic_cap << 32) &
>> VMX_BASIC_CHECK_ERRCODE);
>> +}
>> +
>> static inline bool cpu_has_virtual_nmis(void)
>> {
>> return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS
>> && diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index 78524daa2cb2..92aa4fc3d233 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -1227,9 +1227,9 @@ static int vmx_restore_vmx_basic(struct
>> vcpu_vmx *vmx,
>> u64 data)
>> {
>> const u64 feature_and_reserved =
>> /* feature (except bit 48; see below) */
>> - BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) |
>> + BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) | BIT_ULL(56) |
>> /* reserved */
>> - BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 56);
>> + BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 57);
>> u64 vmx_basic = vmcs_config.nested.basic;
>>
>> if (!is_bitwise_subset(vmx_basic, data, feature_and_reserved))
>> @@ -2873,7 +2873,8 @@ static int
>> nested_check_vm_entry_controls(struct
>> kvm_vcpu *vcpu,
>> should_have_error_code =
>> intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
>> x86_exception_has_error_code(vector);
>> - if (CC(has_error_code != should_have_error_code))
>> + if (!cpu_has_vmx_basic_check_errcode() &&
> We can skip computing should_have_error_code. and we should check if
> IA32_VMX_BASIC[56] is set for this vCPU (i.e. in
> vmx->nested.msrs.basic) rather than host/kvm capability.
Oops, I confused myself, yes, need to reshape the code a bit and use msrs.basic
to check the bit status, thanks!
>
>> + CC(has_error_code != should_have_error_code))
>> return -EINVAL;
>>
>> /* VM-entry exception error code */ @@ -6986,6 +6987,8 @@
>> static void nested_vmx_setup_basic(struct nested_vmx_msrs *msrs)
>>
>> if (cpu_has_vmx_basic_inout())
>> msrs->basic |= VMX_BASIC_INOUT;
>> + if (cpu_has_vmx_basic_check_errcode())
>> + msrs->basic |= VMX_BASIC_CHECK_ERRCODE;
>> }
>>
>> static void nested_vmx_setup_cr_fixed(struct nested_vmx_msrs *msrs)
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index
>> d70f2e94b187..95c0eab7805c 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -2748,7 +2748,7 @@ static int setup_vmcs_config(struct vmcs_config
>> *vmcs_conf,
>> rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
>>
>> vmcs_conf->size = vmx_msr_high & 0x1fff;
>> - vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;
>> + vmcs_conf->basic_cap = vmx_msr_high & ~0x7fff;
>>
>> vmcs_conf->revision_id = vmx_msr_low;
>>
>>
Powered by blists - more mailing lists