lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200304154538.GB21662@linux.intel.com>
Date:   Wed, 4 Mar 2020 07:45:38 -0800
From:   Sean Christopherson <sean.j.christopherson@...el.com>
To:     Yang Weijiang <weijiang.yang@...el.com>
Cc:     kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        pbonzini@...hat.com, jmattson@...gle.com,
        yu.c.zhang@...ux.intel.com
Subject: Re: [PATCH v9 7/7] KVM: X86: Add user-space access interface for CET
 MSRs

On Wed, Mar 04, 2020 at 11:18:15PM +0800, Yang Weijiang wrote:
> On Tue, Mar 03, 2020 at 02:28:27PM -0800, Sean Christopherson wrote:
> > > @@ -1886,6 +1976,26 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > >  		else
> > >  			msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
> > >  		break;
> > > +	case MSR_IA32_S_CET:
> > > +		if (!cet_ctl_access_allowed(vcpu, msr_info))
> > > +			return 1;
> > > +		msr_info->data = vmcs_readl(GUEST_S_CET);
> > > +		break;
> > > +	case MSR_IA32_INT_SSP_TAB:
> > > +		if (!cet_ssp_access_allowed(vcpu, msr_info))
> > > +			return 1;
> > > +		msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
> > > +		break;
> > > +	case MSR_IA32_U_CET:
> > > +		if (!cet_ctl_access_allowed(vcpu, msr_info))
> > > +			return 1;
> > > +		rdmsrl(MSR_IA32_U_CET, msr_info->data);
> > > +		break;
> > > +	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > > +		if (!cet_ssp_access_allowed(vcpu, msr_info))
> > > +			return 1;
> > > +		rdmsrl(msr_info->index, msr_info->data);
> > 
> > Ugh, thought of another problem.  If a SoftIRQ runs after an IRQ it can
> > load the kernel FPU state.  So for all the XSAVES MSRs we'll need a helper
> > similar to vmx_write_guest_kernel_gs_base(), except XSAVES has to be even
> > more restrictive and disable IRQs entirely.  E.g.
> > 
> > static void vmx_get_xsave_msr(struct msr_data *msr_info)
> > {
> > 	local_irq_disable();
> > 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
> > 		switch_fpu_return();
> > 	rdmsrl(msr_info->index, msr_info->data);
> > 	local_irq_enable();
> In this case, would SoftIRQ destroy vcpu->arch.guest.fpu states which
> had been restored to XSAVES MSRs that we were accessing?

Doing kernel_fpu_begin() from a softirq would swap guest.fpu out of the
CPUs registers.  It sets TIF_NEED_FPU_LOAD to mark the tasks has needing to
reload its FPU state prior to returning to userspace.  So it doesn't
destroy it per se.  The result is that KVM would read/write the CET MSRs
after they're loaded from the kernel's FPU state instead of reading the
MSRs loaded from the guest's FPU state.

> So should we restore
> guest.fpu or? In previous patch, we have restored guest.fpu before
> access the XSAVES MSRs.

There are three different FPU states:

  - kernel
  - userspace
  - guest

RDMSR/WRMSR for CET MSRs need to run while the guest.fpu state is loaded
into the CPU registers[1].  At the beginning of the syscall from userspace,
i.e. the vCPU ioctl(), the task's FPU state[2] holds userspace FPU state.
Patch 6/7 swaps out the userspace state and loads the guest state.

But, if a softirq runs between kvm_load_guest_fpu() and now, and executes
kernel_fpu_begin(), it will swap the guest state (out of CPU registers)
and load the kernel state (into PCU registers).  The actual RDMSR/WRMSR
needs to ensure the guest state is still loaded by checking and handling
TIF_NEED_FPU_LOAD.

[1] An alternative to doing switch_fpu_return() on TIF_NEED_FPU_LOAD would
    be to calculate the offset into the xsave and read/write directly
    to/from memory.  But IMO that's unnecessary complexity as the guest's
    fpu state still needs to be reloaded before re-entering the guest, e.g.
    if vmx_{g,s}et_msr() is invoked on {RD,WR}MSR intercept, while loading
    or saving MSR state from userspace isn't a hot path.

[2] I worded this to say "task's FPU state" because it's also possible the
    CPU registers hold kernel state at the beginning of the vCPU ioctl(),
    e.g. because of softirq.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ