linux-kernel - Re: [PATCH] KVM: TDX: Allow userspace to return errors to guest for MAPGPA


Message-ID: <af8bbddc-fcf5-460b-9a6f-1418a0748f37@intel.com>
Date: Thu, 15 Jan 2026 15:47:24 +0800
From: Xiaoyao Li <xiaoyao.li@...el.com>
To: Sagi Shahar <sagis@...gle.com>, Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
 Dave Hansen <dave.hansen@...ux.intel.com>, Kiryl Shutsemau <kas@...nel.org>,
 Rick Edgecombe <rick.p.edgecombe@...el.com>,
 Thomas Gleixner <tglx@...nel.org>, Borislav Petkov <bp@...en8.de>,
 "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org, kvm@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-coco@...ts.linux.dev,
 Vishal Annapurve <vannapurve@...gle.com>
Subject: Re: [PATCH] KVM: TDX: Allow userspace to return errors to guest for
 MAPGPA

On 1/15/2026 9:21 AM, Sagi Shahar wrote:
> On Wed, Jan 14, 2026 at 9:57 AM Sean Christopherson <seanjc@...gle.com> wrote:
>>
>> On Wed, Jan 14, 2026, Xiaoyao Li wrote:
>>> On 1/14/2026 8:30 AM, Sagi Shahar wrote:
>>>> From: Vishal Annapurve <vannapurve@...gle.com>
>>>>
>>>> MAPGPA request from TDX VMs gets split into chunks by KVM using a loop
>>>> of userspace exits until the complete range is handled.
>>>>
>>>> In some cases userspace VMM might decide to break the MAPGPA operation
>>>> and continue it later. For example: in the case of intrahost migration
>>>> userspace might decide to continue the MAPGPA operation after the
>>>> migrration is completed
>>
>> migration
>>
>>>> Allow userspace to signal to TDX guests that the MAPGPA operation should
>>>> be retried the next time the guest is scheduled.
>>
>> To Xiaoyao's point, changes like this either need new uAPI, or a detailed
>> explanation in the changelog of why such uAPI isn't deemed necessary.
>>
>>>> Signed-off-by: Vishal Annapurve <vannapurve@...gle.com>
>>>> Co-developed-by: Sagi Shahar <sagis@...gle.com>
>>>> Signed-off-by: Sagi Shahar <sagis@...gle.com>
>>>> ---
>>>>    arch/x86/kvm/vmx/tdx.c | 8 +++++++-
>>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
>>>> index 2d7a4d52ccfb..3244064b1a04 100644
>>>> --- a/arch/x86/kvm/vmx/tdx.c
>>>> +++ b/arch/x86/kvm/vmx/tdx.c
>>>> @@ -1189,7 +1189,13 @@ static int tdx_complete_vmcall_map_gpa(struct kvm_vcpu *vcpu)
>>>>      struct vcpu_tdx *tdx = to_tdx(vcpu);
>>>>      if (vcpu->run->hypercall.ret) {
>>>> -           tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_INVALID_OPERAND);
>>>> +           if (vcpu->run->hypercall.ret == -EBUSY)
>>>> +                   tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_RETRY);
>>>> +           else if (vcpu->run->hypercall.ret == -EINVAL)
>>>> +                   tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_INVALID_OPERAND);
>>>> +           else
>>>> +                   return -EINVAL;
>>>
>>> It's incorrect to return -EINVAL here.
>>
>> It's not incorrect, just potentially a breaking change.
>>
>>> The -EINVAL will eventually be
>>> returned to userspace for the VCPU_RUN ioctl. It certainly breaks userspace.
>>
>> It _might_ break userspace.  It certainly changes KVM's ABI, but if no userspace
>> actually utilizes the existing ABI, then userspace hasn't been broken.
>>
>> And unless I'm missing something, QEMU _still_ doesn't set hypercall.ret.  E.g.
>> see this code in __tdx_map_gpa().
>>
>>          /*
>>           * In principle this should have been -KVM_ENOSYS, but userspace (QEMU <=9.2)
>>           * assumed that vcpu->run->hypercall.ret is never changed by KVM and thus that
>>           * it was always zero on KVM_EXIT_HYPERCALL.  Since KVM is now overwriting
>>           * vcpu->run->hypercall.ret, ensuring that it is zero to not break QEMU.
>>           */
>>          tdx->vcpu.run->hypercall.ret = 0;
>>
>> AFAICT, QEMU kills the VM if anything goes wrong.
>>
>> So while I initially had the exact same reaction of "this is a breaking change
>> and needs to be opt-in", we might actually be able to get away with just making
>> the change (assuming no other VMMs care, or are willing to change themselves).
> 
> Is there a better source of truth for whether QEMU uses hypercall.ret
> or just point to this comment in the commit message.

Judging from the source code, no version of QEMU touches hypercall.ret.

I suggest not mentioning the comment, because it only says that QEMU
expects vcpu->run->hypercall.ret to be 0 on KVM_EXIT_HYPERCALL. What
matters is that QEMU never sets vcpu->run->hypercall.ret to a non-zero
value after handling KVM_EXIT_HYPERCALL. I think you can just state that
fact in the commit message.
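
For illustration, the return-code mapping proposed in the quoted diff can
be sketched as standalone C. This is a minimal sketch, not the kernel
code: the enum, its values, and the function name sketch_map_gpa_ret()
are hypothetical stand-ins for the TDVMCALL_STATUS_* codes and for the
logic in tdx_complete_vmcall_map_gpa(). It also simplifies the zero case
to "success", whereas per the changelog KVM keeps looping over the
remaining chunks of the range.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the guest-visible TDVMCALL status codes. */
enum sketch_status {
	SKETCH_STATUS_SUCCESS,
	SKETCH_STATUS_RETRY,
	SKETCH_STATUS_INVALID_OPERAND,
};

/*
 * Map the value userspace left in hypercall.ret to a guest-visible status.
 *
 * Returns 0 and fills *status when the value is one the guest can be told
 * about.  Returns -EINVAL for anything else, which in the proposed patch
 * would propagate to userspace as the result of the vCPU run ioctl.
 */
static int sketch_map_gpa_ret(uint64_t ret, enum sketch_status *status)
{
	if (ret == 0) {
		/* Chunk handled; the real code continues with the rest of the range. */
		*status = SKETCH_STATUS_SUCCESS;
		return 0;
	}
	if (ret == (uint64_t)(-EBUSY)) {
		/* Userspace asks the guest to retry the MAPGPA request later. */
		*status = SKETCH_STATUS_RETRY;
		return 0;
	}
	if (ret == (uint64_t)(-EINVAL)) {
		/* Userspace rejects the request as invalid. */
		*status = SKETCH_STATUS_INVALID_OPERAND;
		return 0;
	}
	/* Any other non-zero value is not reported to the guest. */
	return -EINVAL;
}
```

The last branch is the behavior change under discussion: before the
patch, any non-zero value was reported to the guest as an invalid
operand, so a VMM that sets some other value would now see the ioctl
fail instead.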