Message-ID: <CAAhR5DE=ypkYwqEGEJBZs5A2N9OCVaL_9Jxi5YN5X7rNpKSZTw@mail.gmail.com>
Date: Wed, 14 Jan 2026 19:21:01 -0600
From: Sagi Shahar <sagis@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Xiaoyao Li <xiaoyao.li@...el.com>, Paolo Bonzini <pbonzini@...hat.com>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, Kiryl Shutsemau <kas@...nel.org>, 
	Rick Edgecombe <rick.p.edgecombe@...el.com>, Thomas Gleixner <tglx@...nel.org>, 
	Borislav Petkov <bp@...en8.de>, "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org, kvm@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-coco@...ts.linux.dev, 
	Vishal Annapurve <vannapurve@...gle.com>
Subject: Re: [PATCH] KVM: TDX: Allow userspace to return errors to guest for MAPGPA

On Wed, Jan 14, 2026 at 9:57 AM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Wed, Jan 14, 2026, Xiaoyao Li wrote:
> > On 1/14/2026 8:30 AM, Sagi Shahar wrote:
> > > From: Vishal Annapurve <vannapurve@...gle.com>
> > >
> > > MAPGPA request from TDX VMs gets split into chunks by KVM using a loop
> > > of userspace exits until the complete range is handled.
> > >
> > > In some cases userspace VMM might decide to break the MAPGPA operation
> > > and continue it later. For example: in the case of intrahost migration
> > > userspace might decide to continue the MAPGPA operation after the
> > > migrration is completed
>
> migration
>
> > > Allow userspace to signal to TDX guests that the MAPGPA operation should
> > > be retried the next time the guest is scheduled.
>
> To Xiaoyao's point, changes like this either need new uAPI, or a detailed
> explanation in the changelog of why such uAPI isn't deemed necessary.
>
> > > Signed-off-by: Vishal Annapurve <vannapurve@...gle.com>
> > > Co-developed-by: Sagi Shahar <sagis@...gle.com>
> > > Signed-off-by: Sagi Shahar <sagis@...gle.com>
> > > ---
> > >   arch/x86/kvm/vmx/tdx.c | 8 +++++++-
> > >   1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> > > index 2d7a4d52ccfb..3244064b1a04 100644
> > > --- a/arch/x86/kvm/vmx/tdx.c
> > > +++ b/arch/x86/kvm/vmx/tdx.c
> > > @@ -1189,7 +1189,13 @@ static int tdx_complete_vmcall_map_gpa(struct kvm_vcpu *vcpu)
> > >     struct vcpu_tdx *tdx = to_tdx(vcpu);
> > >     if (vcpu->run->hypercall.ret) {
> > > -           tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_INVALID_OPERAND);
> > > +           if (vcpu->run->hypercall.ret == -EBUSY)
> > > +                   tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_RETRY);
> > > +           else if (vcpu->run->hypercall.ret == -EINVAL)
> > > +                   tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_INVALID_OPERAND);
> > > +           else
> > > +                   return -EINVAL;
> >
> > It's incorrect to return -EINVAL here.
>
> It's not incorrect, just potentially a breaking change.
>
> > The -EINVAL will eventually be
> > returned to userspace for the VCPU_RUN ioctl. It certainly breaks userspace.
>
> It _might_ break userspace.  It certainly changes KVM's ABI, but if no userspace
> actually utilizes the existing ABI, then userspace hasn't been broken.
>
> And unless I'm missing something, QEMU _still_ doesn't set hypercall.ret.  E.g.
> see this code in __tdx_map_gpa().
>
>         /*
>          * In principle this should have been -KVM_ENOSYS, but userspace (QEMU <=9.2)
>          * assumed that vcpu->run->hypercall.ret is never changed by KVM and thus that
>          * it was always zero on KVM_EXIT_HYPERCALL.  Since KVM is now overwriting
>          * vcpu->run->hypercall.ret, ensuring that it is zero to not break QEMU.
>          */
>         tdx->vcpu.run->hypercall.ret = 0;
>
> AFAICT, QEMU kills the VM if anything goes wrong.
>
> So while I initially had the exact same reaction of "this is a breaking change
> and needs to be opt-in", we might actually be able to get away with just making
> the change (assuming no other VMMs care, or are willing to change themselves).

Is there a better source of truth for whether QEMU uses hypercall.ret,
or should I just point to this comment in the commit message?

>
> > So it needs to be
> >
> >       if (vcpu->run->hypercall.ret == -EBUSY)
> >               tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_RETRY);
> >       else
> >               tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_INVALID_OPERAND);
>
> No, because assuming everything except -EBUSY translates to
> TDVMCALL_STATUS_INVALID_OPERAND paints KVM back into the same corner it's already
> in.  What I care most about is eliminating KVM's assumption that a non-zero
> hypercall.ret means TDVMCALL_STATUS_INVALID_OPERAND.
>
> For the new ABI, I see two options:
>
>  1. Translate -errno as done in this patch.
>  2. Propagate hypercall.ret directly to the TDVMCALL return code, i.e. let
>     userspace set any return code it wants.
>
> #1 has the downside of needing KVM changes and new uAPI every time a new return
> code is supported.
>
> #2 has the downside of preventing KVM from establishing its own ABI around the
> return code, and making the return code vendor specific.  E.g. if KVM ever wanted
> to do something in response to -EBUSY beyond propagating the error to the guest,
> then we can't reasonably do that with #2.
>
> Whatever we do, I want to change snp_complete_psc_msr() and snp_complete_one_psc()
> in the same patch, so that whatever ABI we establish is common to TDX and SNP.
>
> See also https://lore.kernel.org/all/Zn8YM-s0TRUk-6T-@google.com.
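
A rough standalone sketch of the two options above, written as plain
userspace C rather than kernel code. The sketch_* function names and the
SKETCH_STATUS_* values are made up for illustration; the real status codes
come from the kernel's TDX headers. The sketch also assumes a zero
hypercall.ret simply reports success, whereas the real completion handler
continues with the next chunk.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative stand-ins only; not the kernel's definitions. */
#define SKETCH_STATUS_SUCCESS         0x0ULL
#define SKETCH_STATUS_RETRY           0x1ULL
#define SKETCH_STATUS_INVALID_OPERAND 0x8000000000000000ULL

/*
 * Option 1: translate a -errno value from userspace into a guest-visible
 * status code. Returns 0 and fills *code on success, or -EINVAL when the
 * value is not one the translation knows about (which would fail the ioctl).
 */
static int sketch_translate_ret(uint64_t ret, uint64_t *code)
{
	if (ret == 0) {
		*code = SKETCH_STATUS_SUCCESS;
		return 0;
	}
	if (ret == (uint64_t)-EBUSY) {
		*code = SKETCH_STATUS_RETRY;
		return 0;
	}
	if (ret == (uint64_t)-EINVAL) {
		*code = SKETCH_STATUS_INVALID_OPERAND;
		return 0;
	}
	return -EINVAL;
}

/*
 * Option 2: hand whatever userspace wrote straight to the guest. Nothing is
 * interpreted here, so every value is accepted and becomes vendor specific.
 */
static uint64_t sketch_passthrough_ret(uint64_t ret)
{
	return ret;
}
```

With option 1, each newly supported return code needs another branch in the
translation; with option 2, the function never changes, but it also has no
place to hang any behavior on a specific value such as -EBUSY.
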
>
> > But I'm not sure whether such a change breaks the userspace ABI such that it
> > needs to be opt-in.
