Message-ID: <ZfRhu0GVjWeAAJMB@google.com>
Date: Fri, 15 Mar 2024 07:56:59 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Michael Roth <michael.roth@....com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, linux-kernel@...r.kernel.org, kvm@...r.kernel.org, 
	aik@....com, pankaj.gupta@....com
Subject: Re: [PATCH v3 10/15] KVM: x86: add fields to struct kvm_arch for CoCo features

On Thu, Mar 14, 2024, Michael Roth wrote:
> On Thu, Mar 14, 2024 at 03:56:27PM -0700, Sean Christopherson wrote:
> > On Thu, Mar 14, 2024, Michael Roth wrote:
> > > On Wed, Mar 13, 2024 at 09:49:52PM -0500, Michael Roth wrote:
> > > > I've been trying to get SNP running on top of these patches and hit an
> > > > issue with these due to fpstate_set_confidential() being done during
> > > > svm_vcpu_create(), so when QEMU tries to sync FPU state prior to calling
> > > > SNP_LAUNCH_FINISH it errors out. I think the same would happen with
> > > > SEV-ES as well.
> > > > Maybe fpstate_set_confidential() should be relocated to SEV_LAUNCH_FINISH
> > > > site as part of these patches?
> > > 
> > > Talked to Tom a bit about this and that might not make much sense unless
> > > we actually want to add some code to sync that FPU state into the VMSA

Is manually copying required for register state?  If so, manually copying everything
seems like the way to go, otherwise we'll end up with a confusing ABI where a
rather arbitrary set of bits are (not) configurable by userspace.

> > > prior to encryption/measurement. Otherwise, it might as well be set to
> > > confidential as soon as vCPU is created.
> > > 
> > > And if userspace wants to write FPU register state that will not actually
> > > become part of the guest state, it probably does make sense to return an
> > > error for new VM types and leave it to userspace to deal with
> > > special-casing that vs. the other ioctls like SET_REGS/SREGS/etc.
> > 
> > Won't regs and sregs suffer the same fate?  That might not matter _today_ for
> > "real" VMs, but it would be a blocking issue for selftests, which need to stuff
> > state to jumpstart vCPUs.
> 
> SET_REGS/SREGS and the others only throw an error when
> vcpu->arch.guest_state_protected gets set, which doesn't happen until

Ah, I misread the diff and didn't see the existing check on fpstate_is_confidential().

Side topic, I could have sworn KVM didn't allocate the guest fpstate for SEV-ES,
but git blame says otherwise.  Avoiding that allocation would have been an argument
for immediately marking the fpstate confidential.

That said, any reason not to free the state when the fpstate is marked confidential?

> sev_launch_update_vmsa(). So in those cases userspace is still able to sync
> additional/non-reset state prior to initial launch. It's just XSAVE/XSAVE2 that
> are a bit more restrictive because they check fpstate_is_confidential()
> instead, which gets set during vCPU creation.
> 
> Somewhat related, but just noticed that KVM_SET_FPU also relies on
> fpstate_is_confidential() but still silently returns 0 with this series.
> Seems like it should be handled the same way as XSAVE/XSAVE2, whatever we
> end up doing.

+1

Also, I think a less confusing and more robust way to deal with the new VM types
would be to condition only the return code on whether or not the VM has protected
state, e.g.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9d670a45aea4..0e245738d4c5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5606,10 +5606,6 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
 static int kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
                                         u8 *state, unsigned int size)
 {
-       if (vcpu->kvm->arch.has_protected_state &&
-           fpstate_is_confidential(&vcpu->arch.guest_fpu))
-               return -EINVAL;
-
        /*
         * Only copy state for features that are enabled for the guest.  The
         * state itself isn't problematic, but setting bits in the header for
@@ -5626,7 +5622,7 @@ static int kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
                             XFEATURE_MASK_FPSSE;
 
        if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
-               return 0;
+               return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0;
 
        fpu_copy_guest_fpstate_to_uabi(&vcpu->arch.guest_fpu, state, size,
                                       supported_xcr0, vcpu->arch.pkru);
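
For illustration only, the return-code behavior proposed in the diff above can be modeled in userspace as follows. This is a rough sketch, not kernel code: struct model_vcpu, model_get_xsave2() and the "copied" out-parameter are hypothetical stand-ins for vcpu->kvm->arch.has_protected_state, fpstate_is_confidential() and the fpu_copy_guest_fpstate_to_uabi() call.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant vCPU state; not the real structs. */
struct model_vcpu {
	bool has_protected_state;   /* models vcpu->kvm->arch.has_protected_state */
	bool fpstate_confidential;  /* models fpstate_is_confidential(&guest_fpu) */
};

/*
 * Models the proposed flow: a single check on the confidential flag, with
 * only the return code depending on whether the VM has protected state.
 * *copied reports whether the (modeled) copy to userspace would happen.
 */
static int model_get_xsave2(const struct model_vcpu *vcpu, bool *copied)
{
	*copied = false;

	if (vcpu->fpstate_confidential)
		return vcpu->has_protected_state ? -EINVAL : 0;

	/* Stand-in for fpu_copy_guest_fpstate_to_uabi(). */
	*copied = true;
	return 0;
}
```

Under these assumptions, existing VM types keep the silent "return 0" when the fpstate is confidential, while VM types with protected state get -EINVAL from the same check.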