[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB52768B84B24664A756CDF2C48C779@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Thu, 16 Dec 2021 13:00:43 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Paolo Bonzini <pbonzini@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
"Wang, Wei W" <wei.w.wang@...el.com>,
"quintela@...hat.com" <quintela@...hat.com>
CC: LKML <linux-kernel@...r.kernel.org>,
"Dr. David Alan Gilbert" <dgilbert@...hat.com>,
Jing Liu <jing2.liu@...ux.intel.com>,
"Zhong, Yang" <yang.zhong@...el.com>,
"x86@...nel.org" <x86@...nel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Christopherson,, Sean" <seanjc@...gle.com>,
"Nakajima, Jun" <jun.nakajima@...el.com>,
"Zeng, Guang" <guang.zeng@...el.com>
Subject: RE: [patch 5/6] x86/fpu: Provide fpu_update_guest_xcr0/xfd()
Hi, Paolo/Thomas,
Any comment on following opens? In general doing static buffer
expansion definitely simplifies the logic, but still need your help to
finalize its impact on the overall design. 😊
Thanks
Kevin
> From: Tian, Kevin
> Sent: Thursday, December 16, 2021 1:36 PM
>
> > From: Tian, Kevin
> > Sent: Thursday, December 16, 2021 9:00 AM
> > >
> > > Off-list, Thomas mentioned doing it even at vCPU creation as long as the
> > > prctl has been called. That is also okay and even simpler.
> >
> > Make sense. It also avoids the #GP thing in the emulation path if just due
> > to reallocation error.
> >
> > We'll follow this direction to do a quick update/test.
> >
>
> After some study there are three opens which we'd like to sync here. Once
> they are closed we'll send out a new version very soon (hopefully tomorrow).
>
> 1) Have a full expanded buffer at vCPU creation
>
> There are two options.
>
> One is to directly allocate a big-enough buffer upon guest_fpu::perm in
> fpu_alloc_guest_fpstate(). There is no reallocation per-se thus most changes
> in this series are not required.
>
> The other is to keep the reallocation concept (thus all previous patches are
> kept) and still call a wrapper around __xfd_enable_feature() even at vCPU
> creation (e.g. right after fpu_init_guest_permissions()). This matches the
> fpu core assumption that fpstate for xfd features are dynamically allocated,
> though the actual calling point may not be dynamic. This also allows us
> to exploit doing expansion in KVM_SET_CPUID2 (see next).
>
> 2) Do expansion at vCPU creation or KVM_ SET_CPUID2?
>
> If the reallocation concept is still kept, then we feel doing expansion in
> KVM_SET_CPUID2 makes slightly more sense. There is no functional
> difference between two options since the guest is not running at this
> point. And in general Qemu should set prctl according to the cpuid bits.
> But since anyway we still need to check guest cpuid against guest perm in
> KVM_SET_CPUID2, it reads clearer to expand the buffer only after this
> check is passed.
>
> If this approach is agreed, then we may replace the helper functions in
> this patch with a new one:
>
> /*
> * fpu_update_guest_perm_features - Enable xfeatures according to guest
> perm
> * @guest_fpu: Pointer to the guest FPU container
> *
> * Enable all dynamic xfeatures according to guest perm. Invoked if the
> * caller wants to conservatively expand fpstate buffer instead of waiting
> * until a given feature is accessed.
> *
> * Return: 0 on success, error code otherwise
> */
> +int fpu_update_guest_perm_features(struct fpu_guest *guest_fpu)
> +{
> + u64 expand;
> +
> + lockdep_assert_preemption_enabled();
> +
> + if (!IS_ENABLED(CONFIG_X86_64))
> + return 0;
> +
> + expand = guest_fpu->perm & ~guest_fpu->xfeatures;
> + if (!expand)
> + return 0;
> +
> + return __xfd_enable_feature(expand, guest_fpu);
> +}
> +EXPORT_SYMBOL_GPL(fpu_update_guest_features);
>
> 3) Always disable interception of disable after 1st interception?
>
> Once we choose to have a full expanded buffer before guest runs, the
> point of intercepting WRMSR(IA32_XFD) becomes less obvious since
> no dynamic reallocation is required.
>
> One option is to always disable WRMSR interception once
> KVM_SET_CPUID2 succeeds, with the cost of one RDMSR per vm-exit.
> But doing so affects legacy OS which even has no XFD logic at all.
>
> The other option is to continue the current policy i.e. disable write
> emulation only after the 1st interception of setting XFD to a non-zero
> value. Then the RDMSR cost is added only for guest which supports XFD.
>
> In either case we need another helper to update guest_fpu->fpstate->xfd
> as required in the restore path. For the 2nd option we further want
> to update MSR if guest_fpu is currently in use:
>
> +void xfd_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd)
> +{
> + fpregs_lock();
> + guest_fpu->fpstae->xfd = xfd;
> + if (guest_fpu->fpstate->in_use)
> + xfd_update_state(guest_fpu->fpstate);
> + fpregs_unlock();
> +}
>
> Thoughts?
> --
> p.s. currently we're working on a quick prototype based on:
> - Expand buffer in KVM_SET_CPUID2
> - Disable write emulation after first interception
> to check any oversight.
>
> Thanks
> Kevin
Powered by blists - more mailing lists