lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZseL6kZ436EGuS9H@google.com>
Date: Thu, 22 Aug 2024 12:05:14 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Daniil Tatianin <d-tatianin@...dex-team.ru>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	yc-core@...dex-team.ru
Subject: Re: [PATCH] kvm_host: bump KVM_MAX_IRQ_ROUTE to 128k

KVM: for the scope

On Thu, Mar 21, 2024, Daniil Tatianin wrote:
> We would like to be able to create large VMs (up to 224 vCPUs atm) with
> up to 128 virtio-net cards, where each card needs a TX+RX queue per vCPU
> for optimal performance (as well as config & control interrupts per
> card). Adding in extra virtio-blk controllers with a queue per vCPU (up
> to 192 disks) yields a total of about ~100k IRQ routes, rounded up to
> 128k for extra headroom and flexibility.
> 
> The current limit of 4096 was set in 2018 and is too low for modern
> demands. It also seems to be there for no good reason as routes are
> allocated lazily by the kernel anyway (depending on the largest GSI
> requested by the VM).
> 
> Signed-off-by: Daniil Tatianin <d-tatianin@...dex-team.ru>
> ---
>  include/linux/kvm_host.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 48f31dcd318a..10a141add2a8 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2093,7 +2093,7 @@ static inline bool mmu_invalidate_retry_gfn_unsafe(struct kvm *kvm,
>  
>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>  
> -#define KVM_MAX_IRQ_ROUTES 4096 /* might need extension/rework in the future */
> +#define KVM_MAX_IRQ_ROUTES 131072 /* might need extension/rework in the future */

I am not comfortable simply bumping the max.  Yeah, it's allocated on-demand, but
if my math is correct, the means a max of ~8MiB for the table, plus another 8MiB
for the tnries.   And when handling KVM_SET_GSI_ROUTING, KVM will have 2 tables
(old and new), and another 4MiB for duplicating the userspace array. Those allocations
are accounted, but that's still a lot of potential thrash.

And KVM's handling is also grossly inefficient, e.g. reallocating everything just
to change one routing entry is awful.  Maybe painfully slow updates are fine for
your use case, but some OSes have a bad habit of round-robining IRQ destinations
on a regular basis.

So it might be "free" in the sense that it costs you nothing to get your use case
working, but there's very much a cost for KVM in the form of technical debt that
someone will have to eventually pay for.

I don't have any concrete thoughts on how to make KVM's implementation less sucky,
but I do think we need to give it some attention before increasing the maximum
number of IRQ routes, especially before increasing it by 32x.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ