linux-kernel - RE: [patch 5/6] x86/fpu: Provide fpu_update_guest

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB52768B84B24664A756CDF2C48C779@BN9PR11MB5276.namprd11.prod.outlook.com>
Date:   Thu, 16 Dec 2021 13:00:43 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     Paolo Bonzini <pbonzini@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "Wang, Wei W" <wei.w.wang@...el.com>,
        "quintela@...hat.com" <quintela@...hat.com>
CC:     LKML <linux-kernel@...r.kernel.org>,
        "Dr. David Alan Gilbert" <dgilbert@...hat.com>,
        Jing Liu <jing2.liu@...ux.intel.com>,
        "Zhong, Yang" <yang.zhong@...el.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "Christopherson,, Sean" <seanjc@...gle.com>,
        "Nakajima, Jun" <jun.nakajima@...el.com>,
        "Zeng, Guang" <guang.zeng@...el.com>
Subject: RE: [patch 5/6] x86/fpu: Provide fpu_update_guest_xcr0/xfd()

Hi, Paolo/Thomas,

Any comment on following opens? In general doing static buffer 
expansion definitely simplifies the logic, but still need your help to 
finalize its impact on the overall design. 😊

Thanks
Kevin

> From: Tian, Kevin
> Sent: Thursday, December 16, 2021 1:36 PM
> 
> > From: Tian, Kevin
> > Sent: Thursday, December 16, 2021 9:00 AM
> > >
> > > Off-list, Thomas mentioned doing it even at vCPU creation as long as the
> > > prctl has been called.  That is also okay and even simpler.
> >
> > Make sense. It also avoids the #GP thing in the emulation path if just due
> > to reallocation error.
> >
> > We'll follow this direction to do a quick update/test.
> >
> 
> After some study there are three opens which we'd like to sync here. Once
> they are closed we'll send out a new version very soon (hopefully tomorrow).
> 
> 1) Have a full expanded buffer at vCPU creation
> 
> There are two options.
> 
> One is to directly allocate a big-enough buffer upon guest_fpu::perm in
> fpu_alloc_guest_fpstate(). There is no reallocation per-se thus most changes
> in this series are not required.
> 
> The other is to keep the reallocation concept (thus all previous patches are
> kept) and still call a wrapper around __xfd_enable_feature() even at vCPU
> creation (e.g. right after fpu_init_guest_permissions()). This matches the
> fpu core assumption that fpstate for xfd features are dynamically allocated,
> though the actual calling point may not be dynamic. This also allows us
> to exploit doing expansion in KVM_SET_CPUID2 (see next).
> 
> 2) Do expansion at vCPU creation or KVM_ SET_CPUID2?
> 
> If the reallocation concept is still kept, then we feel doing expansion in
> KVM_SET_CPUID2 makes slightly more sense. There is no functional
> difference between two options since the guest is not running at this
> point. And in general Qemu should set prctl according to the cpuid bits.
> But since anyway we still need to check guest cpuid against guest perm in
> KVM_SET_CPUID2, it reads clearer to expand the buffer only after this
> check is passed.
> 
> If this approach is agreed, then we may replace the helper functions in
> this patch with a new one:
> 
> /*
>  * fpu_update_guest_perm_features - Enable xfeatures according to guest
> perm
>  * @guest_fpu:		Pointer to the guest FPU container
>  *
>  * Enable all dynamic xfeatures according to guest perm. Invoked if the
>  * caller wants to conservatively expand fpstate buffer instead of waiting
>  * until a given feature is accessed.
>  *
>  * Return: 0 on success, error code otherwise
>  */
> +int fpu_update_guest_perm_features(struct fpu_guest *guest_fpu)
> +{
> +	u64 expand;
> +
> +	lockdep_assert_preemption_enabled();
> +
> +	if (!IS_ENABLED(CONFIG_X86_64))
> +		return 0;
> +
> +	expand = guest_fpu->perm & ~guest_fpu->xfeatures;
> +	if (!expand)
> +		return 0;
> +
> +	return __xfd_enable_feature(expand, guest_fpu);
> +}
> +EXPORT_SYMBOL_GPL(fpu_update_guest_features);
> 
> 3) Always disable interception of disable after 1st interception?
> 
> Once we choose to have a full expanded buffer before guest runs, the
> point of intercepting WRMSR(IA32_XFD) becomes less obvious since
> no dynamic reallocation is required.
> 
> One option is to always disable WRMSR interception once
> KVM_SET_CPUID2 succeeds, with the cost of one RDMSR per vm-exit.
> But doing so affects legacy OS which even has no XFD logic at all.
> 
> The other option is to continue the current policy i.e. disable write
> emulation only after the 1st interception of setting XFD to a non-zero
> value. Then the RDMSR cost is added only for guest which supports XFD.
> 
> In either case we need another helper to update guest_fpu->fpstate->xfd
> as required in the restore path. For the 2nd option we further want
> to update MSR if guest_fpu is currently in use:
> 
> +void xfd_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd)
> +{
> +	fpregs_lock();
> +	guest_fpu->fpstae->xfd = xfd;
> +	if (guest_fpu->fpstate->in_use)
> +		xfd_update_state(guest_fpu->fpstate);
> +	fpregs_unlock();
> +}
> 
> Thoughts?
> --
> p.s. currently we're working on a quick prototype based on:
>   - Expand buffer in KVM_SET_CPUID2
>   - Disable write emulation after first interception
> to check any oversight.
> 
> Thanks
> Kevin