linux-kernel - Re: [PATCH 16/19] kvm: x86: Introduce KVM_{G|S}ET

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <08107331-34b9-b33d-67ee-300f216341e0@redhat.com>
Date:   Mon, 13 Dec 2021 11:43:34 +0100
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Yang Zhong <yang.zhong@...el.com>, x86@...nel.org,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com
Cc:     seanjc@...gle.com, jun.nakajima@...el.com, kevin.tian@...el.com,
        jing2.liu@...ux.intel.com, jing2.liu@...el.com
Subject: Re: [PATCH 16/19] kvm: x86: Introduce KVM_{G|S}ET_XSAVE2 ioctl

On 12/13/21 11:10, Thomas Gleixner wrote:
> On Fri, Dec 10 2021 at 17:30, Paolo Bonzini wrote:
>> On 12/8/21 01:03, Yang Zhong wrote:
>>> +static int kvm_vcpu_ioctl_x86_set_xsave2(struct kvm_vcpu *vcpu, u8 *state)
>>> +{
>>> +	if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
>>> +		return 0;
>>> +
>>> +	return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu, state,
>>> +					      supported_xcr0, &vcpu->arch.pkru);
>>> +}
>>> +
>>
>> I think fpu_copy_uabi_to_guest_fpstate (and therefore
>> copy_uabi_from_kernel_to_xstate) needs to check that the size is
>> compatible with the components in the input.
> 
> fpu_copy_uabi_to_guest_fpstate() expects that the input buffer is
> correctly sized. We surely can add a size check there.

fpu_copy_guest_fpstate_to_uabi is more problematic because that one
writes memory.  For fpu_copy_uabi_to_guest_fpstate, we know the input
buffer size from the components and we can use it to do a properly-sized
memdup_user.

For fpu_copy_guest_fpstate_to_uabi we can just decide that KVM_GET_XSAVE
will only save up to the first 4K.  Something like the following might
actually be good for 5.16-rc; right now, header.xfeatures might lead
userspace into reading uninitialized or unmapped memory:

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index d28829403ed0..69609b8c3887 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1138,8 +1138,10 @@ void __copy_xstate_to_uabi_buf(struct membuf to, struct fpstate *fpstate,
  	struct xstate_header header;
  	unsigned int zerofrom;
  	u64 mask;
+	u64 size;
  	int i;
  
+	size = to->left;
  	memset(&header, 0, sizeof(header));
  	header.xfeatures = xsave->header.xfeatures;
  
@@ -1186,7 +1188,20 @@ void __copy_xstate_to_uabi_buf(struct membuf to, struct fpstate *fpstate,
  	/* Copy xsave->i387.sw_reserved */
  	membuf_write(&to, xstate_fx_sw_bytes, sizeof(xsave->i387.sw_reserved));
  
-	/* Copy the user space relevant state of @xsave->header */
+	/*
+	 * Copy the user space relevant state of @xsave->header.
+	 * If not all features fit in the buffer, drop them from the
+	 * saved state so that userspace does not read uninitialized or
+	 * unmapped memory.
+	 */
+	mask = fpstate->user_xfeatures;
+	for_each_extended_xfeature(i, mask) {
+		if (xstate_offsets[i] + xstate_size[i] > size) {
+			header.xfeatures &= BIT(i) - 1;
+			mask &= BIT(i) - 1;
+			break;
+		}
+	}
  	membuf_write(&to, &header, sizeof(header));
  
  	zerofrom = offsetof(struct xregs_state, extended_state_area);
@@ -1197,7 +1212,6 @@ void __copy_xstate_to_uabi_buf(struct membuf to, struct fpstate *fpstate,
  	 * but there is no state to copy from in the compacted
  	 * init_fpstate. The gap tracking will zero these states.
  	 */
-	mask = fpstate->user_xfeatures;
  
  	for_each_extended_xfeature(i, mask) {
  		/*



>> Also, IIUC the size of the AMX state will vary in different processors.
>>    Is this correct?  If so, this should be handled already by
>> KVM_GET/SET_XSAVE2 and therefore should be part of the
>> arch/x86/kernel/fpu APIs.  In the future we want to support migrating a
>> "small AMX" host to a "large AMX" host; and also migrating from a "large
>> AMX" host to a "small AMX" host if the guest CPUID is compatible with
>> the destination of the migration.
> 
> How is that supposed to work? If the AMX state size differs then the
> hosts are not compatible.

I replied with some more questions later.  Basically it depends on how
Intel will define palettes that aren't part of the first implementation
of AMX.

Paolo