Message-ID: <3999aadf-92a8-43f9-8d9d-84aa47e7d1ae@linux.intel.com>
Date: Fri, 31 May 2024 09:22:51 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>,
Paolo Bonzini <pbonzini@...hat.com>
Cc: Michael Roth <michael.roth@....com>, kvm@...r.kernel.org,
linux-coco@...ts.linux.dev, linux-mm@...ck.org,
linux-crypto@...r.kernel.org, x86@...nel.org, linux-kernel@...r.kernel.org,
tglx@...utronix.de, mingo@...hat.com, jroedel@...e.de,
thomas.lendacky@....com, hpa@...or.com, ardb@...nel.org,
vkuznets@...hat.com, jmattson@...gle.com, luto@...nel.org,
dave.hansen@...ux.intel.com, slp@...hat.com, pgonda@...gle.com,
peterz@...radead.org, srinivas.pandruvada@...ux.intel.com,
rientjes@...gle.com, dovmurik@...ux.ibm.com, tobin@....com, bp@...en8.de,
vbabka@...e.cz, kirill@...temov.name, ak@...ux.intel.com,
tony.luck@...el.com, sathyanarayanan.kuppuswamy@...ux.intel.com,
alpergun@...gle.com, jarkko@...nel.org, ashish.kalra@....com,
nikunj.dadhania@....com, pankaj.gupta@....com, liam.merwick@...cle.com,
Brijesh Singh <brijesh.singh@....com>,
Isaku Yamahata <isaku.yamahata@...el.com>
Subject: Re: [PATCH v15 09/20] KVM: SEV: Add support to handle MSR based Page
State Change VMGEXIT

On 5/30/2024 4:02 AM, Sean Christopherson wrote:
> On Tue, May 28, 2024, Paolo Bonzini wrote:
>> On Mon, May 27, 2024 at 2:26 PM Binbin Wu <binbin.wu@...ux.intel.com> wrote:
>>>> It seems like TDX should be able to do something similar by limiting the
>>>> size of each KVM_HC_MAP_GPA_RANGE to TDX_MAP_GPA_MAX_LEN, and then
>>>> returning TDG_VP_VMCALL_RETRY to guest if the original size was greater
>>>> than TDX_MAP_GPA_MAX_LEN. But at that point you're effectively done with
>>>> the entire request and can return to guest, so it actually seems a little
>>>> more straightforward than the SNP case above. E.g. TDX has a 1:1 mapping
>>>> between TDG_VP_VMCALL_MAP_GPA and KVM_HC_MAP_GPA_RANGE events. (And even
>>>> similar names :))
>>>>
>>>> So doesn't seem like there's a good reason to expose any of these
>>>> throttling details to userspace,
>> I think userspace should never be worried about throttling. I would
>> say it's up to the guest to split the GPA into multiple ranges,
> I agree in principle, but in practice I can understand not wanting to split up
> the conversion in the guest due to the additional overhead of the world switches.
>
>> but that's not how arch/x86/coco/tdx/tdx.c is implemented so instead we can
>> do the split in KVM instead. It can be a module parameter or VM attribute,
>> establishing the size that will be processed in a single TDVMCALL.
> Is it just interrupts that are problematic for conversions? I assume so, because
> I can't think of anything else where telling the guest to retry would be appropriate
> and useful.
The concern was lockup detection in the guest.
>
> If so, KVM shouldn't need to unconditionally restrict the size for a single
> TDVMCALL, KVM just needs to ensure interrupts are handled soonish. To do that,
> KVM could use a much smaller chunk size, e.g. 64KiB (completely made up number),
> and keep processing the TDVMCALL as long as there is no interrupt pending.
> Hopefully that would obviate the need for a tunable.
Thanks for the suggestion.
This way, interrupts can be injected into the guest in time, so lockup
detection should not be a problem.
As for the chunk size, if it is too small, it will increase the cost of
kernel/userspace context switches.
Maybe 2MB?
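
To make the proposal concrete, here is a rough user-space sketch of the
chunked loop, not actual KVM code. Every name in it (MAP_GPA_CHUNK_SIZE,
struct sim_vcpu, map_gpa_chunked(), and so on) is hypothetical and exists
only for illustration. It assumes the request is converted one chunk at a
time and that the guest is told to retry from the first unconverted GPA as
soon as an interrupt is pending. In a real implementation each chunk would
presumably be a separate exit to userspace, which is where the context
switch cost comes from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical chunk size: 2MB, as floated above (64KiB was the other guess). */
#define MAP_GPA_CHUNK_SIZE (2ULL << 20)

/* Hypothetical stand-in for vCPU state; not a real KVM structure. */
struct sim_vcpu {
	unsigned int irq_after_chunks;	/* IRQ pending after N chunks, 0 = never */
	unsigned int chunks_done;	/* chunks converted so far */
};

/* Hypothetical outcome of one MAP_GPA request. */
struct map_gpa_result {
	bool retry;		/* true: guest should retry from next_gpa */
	uint64_t next_gpa;	/* first GPA that was not converted */
};

/* Stand-in for the real "is an interrupt pending?" check. */
static bool sim_irq_pending(const struct sim_vcpu *vcpu)
{
	return vcpu->irq_after_chunks &&
	       vcpu->chunks_done >= vcpu->irq_after_chunks;
}

/* Stand-in for converting one chunk; here it only counts the chunk. */
static void sim_convert_chunk(struct sim_vcpu *vcpu, uint64_t gpa, uint64_t len)
{
	(void)gpa;
	(void)len;
	vcpu->chunks_done++;
}

/*
 * Convert [gpa, gpa + size) in pieces of at most 'chunk' bytes. Keep going
 * as long as no interrupt is pending; otherwise stop early and report where
 * the guest should resume.
 */
static struct map_gpa_result map_gpa_chunked(struct sim_vcpu *vcpu,
					     uint64_t gpa, uint64_t size,
					     uint64_t chunk)
{
	struct map_gpa_result res = { .retry = false, .next_gpa = gpa };
	uint64_t end = gpa + size;

	assert(chunk > 0);

	while (res.next_gpa < end) {
		uint64_t len = end - res.next_gpa;

		if (len > chunk)
			len = chunk;

		sim_convert_chunk(vcpu, res.next_gpa, len);
		res.next_gpa += len;

		/* Only ask for a retry if there is still work left. */
		if (res.next_gpa < end && sim_irq_pending(vcpu)) {
			res.retry = true;
			break;
		}
	}

	return res;
}
```

With no interrupt pending, an 8MB request completes in four 2MB chunks in a
single call. With an interrupt pending after the first chunk, the call stops
after 2MB and reports the next GPA for the guest to retry from.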