Date:   Fri, 14 Oct 2022 04:03:38 +0000
From:   "Yao, Yuan" <yuan.yao@...el.com>
To:     "Bae, Chang Seok" <chang.seok.bae@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC:     "x86@...nel.org" <x86@...nel.org>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: RE: [PATCH] x86/fpu: Remove dynamic features from xcomp_bv for
 init_fpstate

>-----Original Message-----
>From: Bae, Chang Seok <chang.seok.bae@...el.com>
>Sent: Friday, October 14, 2022 00:23
>To: Yao, Yuan <yuan.yao@...el.com>; Dave Hansen <dave.hansen@...ux.intel.com>; linux-kernel@...r.kernel.org
>Cc: x86@...nel.org; Hansen, Dave <dave.hansen@...el.com>; Thomas Gleixner <tglx@...utronix.de>
>Subject: Re: [PATCH] x86/fpu: Remove dynamic features from xcomp_bv for init_fpstate
>
>On 10/12/2022 8:35 PM, Yao, Yuan wrote:
>>
>> The reason is that __copy_xstate_to_uabi_buf() copies data from &init_fpstate when the component
>> does not exist in the source kernel fpstate (here, the AMX tile component), but this patch
>> removes the AMX TILE bit from init_fpstate, so the WARN triggers and NULL is returned,
>> which later causes a kernel NULL pointer dereference.
>
>We have this in __copy_xstate_to_uabi_buf() [1]:
>
>	mask = fpstate->user_xfeatures;
>
>	for_each_extended_xfeature(i, mask) {
>	...
>	}
>
>And the KVM code seems to set dynamic features regardless of the buffer
>reallocation [2]:
>
>	vcpu->arch.guest_fpu.fpstate->user_xfeatures =
>		vcpu->arch.guest_supported_xcr0 | XFEATURE_MASK_FPSSE;
>
>The kernel code seems to be aware of this as fpstate_realloc() does [3]:
>
>	if (!guest_fpu)
>		newfps->user_xfeatures = curfps->user_xfeatures | xfeatures;
>
>But it updates the 'xfeatures' bitmask for all:
>
>	newfps->xfeatures = curfps->xfeatures | xfeatures;
>
>So, I think we can do something like this here:
>
>diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
>index c8340156bfd2..8ea7d0e95f1a 100644
>--- a/arch/x86/kernel/fpu/xstate.c
>+++ b/arch/x86/kernel/fpu/xstate.c
>@@ -1127,8 +1127,12 @@ void __copy_xstate_to_uabi_buf(struct membuf to, struct fpstate *fpstate,
>          * non-compacted format disabled features still occupy state space,
>          * but there is no state to copy from in the compacted
>          * init_fpstate. The gap tracking will zero these states.
>+        *
>+        * In the case of guest fpstate, this user_xfeatures does not
>+        * dynamically reflect the capacity of the XSAVE buffer but
>+        * xfeatures does. So AND them together.
>          */
>-       mask = fpstate->user_xfeatures;
>+       mask = fpstate->user_xfeatures & fpstate->xfeatures;

This doesn't work. At this point KVM has already called fpstate_realloc() for the guest
fpstate, so the dynamic bits are already set in fpstate->xfeatures: its value is 0x606e7 here.

Also, the guest fpstate's xstate_bv (header.xfeatures here) is 0 at this point, so all data is read from
init_fpstate instead of the guest fpstate, which triggers the WARN when the AMX TILE component is read.
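
To make the point about the mask concrete, here is a minimal userspace sketch (not kernel code). It assumes XFEATURE_XTILE_DATA is bit 18; proposed_mask() and reads_from_init() are hypothetical helper names used only for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(n)          (1ULL << (n))
#define XFEATURE_XTILE_DATA 18	/* assumed bit position of AMX TILE data */

/* Copy mask as computed by the proposed change (hypothetical helper). */
static uint64_t proposed_mask(uint64_t user_xfeatures, uint64_t xfeatures)
{
	return user_xfeatures & xfeatures;
}

/*
 * True when the copy loop would still visit component 'nr' and, because
 * the guest header does not mark it as present, would fall back to
 * init_fpstate for its data (hypothetical helper).
 */
static bool reads_from_init(uint64_t mask, uint64_t xstate_bv, int nr)
{
	return (mask & BIT_ULL(nr)) && !(xstate_bv & BIT_ULL(nr));
}
```

With xfeatures = 0x606e7 and xstate_bv = 0 as observed above, and a user_xfeatures value of 0x602e7 (assumed here, not measured), the AND keeps bit 18 set, so reads_from_init(mask, 0, XFEATURE_XTILE_DATA) is true and the AMX TILE read still goes to init_fpstate.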

To keep using init_fpstate as the fallback for reading component data in the case above, changes like
the ones below should work, but they remove the valuable WARN_ON_ONCE from __raw_xsave_addr():

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index f9f45610c72f..1471de470b58 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -941,7 +941,7 @@ static void *__raw_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
                return NULL;

        if (cpu_feature_enabled(X86_FEATURE_XCOMPACTED)) {
-               if (WARN_ON_ONCE(!(xcomp_bv & BIT_ULL(xfeature_nr))))
+               if (!(xcomp_bv & BIT_ULL(xfeature_nr)))
                        return NULL;
        }

@@ -1049,7 +1049,10 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
                         void *init_xstate, unsigned int size)
 {
-       membuf_write(to, from_xstate ? xstate : init_xstate, size);
+       if ((from_xstate && xstate) || (!from_xstate && init_xstate))
+               membuf_write(to, from_xstate ? xstate : init_xstate, size);
+       else
+               membuf_zero(to, size);
 }
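
The intended behaviour of the copy_feature() change can be tried out in userspace with the rough sketch below. The struct membuf and the membuf_write()/membuf_zero() functions here are simplified stand-ins written for this example, not the kernel's definitions, and the condition is folded into a single NULL check on the selected source, which should be equivalent to the one in the diff:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct membuf (assumption). */
struct membuf {
	unsigned char *p;	/* current write position */
	size_t left;		/* bytes remaining in the buffer */
};

/* Copy up to 'size' bytes from 'src' and advance the cursor. */
static void membuf_write(struct membuf *to, const void *src, size_t size)
{
	if (size > to->left)
		size = to->left;
	memcpy(to->p, src, size);
	to->p += size;
	to->left -= size;
}

/* Write up to 'size' zero bytes and advance the cursor. */
static void membuf_zero(struct membuf *to, size_t size)
{
	if (size > to->left)
		size = to->left;
	memset(to->p, 0, size);
	to->p += size;
	to->left -= size;
}

/* Zero-fill the output when the selected source pointer is NULL. */
static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
			 void *init_xstate, size_t size)
{
	void *src = from_xstate ? xstate : init_xstate;

	if (src)
		membuf_write(to, src, size);
	else
		membuf_zero(to, size);
}
```

Under these assumptions, a component that is missing from both the guest fpstate and init_fpstate ends up as zeros in the user buffer instead of being read through a NULL pointer.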

>
>Let me also test this by running KVM.
>
>Thanks,
>Chang
>
>[1]
>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/fpu/xstate.c#n1131
>[2]
>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/cpuid.c#n346
>[3]
>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/fpu/xstate.c#n1448
