linux-kernel - Re: [PATCH v3 1/4] x86/kvm: add boot parameter for adding vcpu-id bits

Message-ID: <eab3fe21-e209-a1af-3b7b-ed831cf1990d@suse.com>
Date:   Thu, 18 Nov 2021 16:19:15 +0100
From:   Juergen Gross <jgross@...e.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     kvm@...r.kernel.org, x86@...nel.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, Jonathan Corbet <corbet@....net>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v3 1/4] x86/kvm: add boot parameter for adding vcpu-id
 bits

On 18.11.21 16:09, Sean Christopherson wrote:
> On Thu, Nov 18, 2021, Juergen Gross wrote:
>> On 18.11.21 00:46, Sean Christopherson wrote:
>>> On Wed, Nov 17, 2021, Juergen Gross wrote:
>>>> On 16.11.21 15:10, Juergen Gross wrote:
>>>>> Today the maximum vcpu-id of a kvm guest's vcpu on x86 systems is set
>>>>> via a #define in a header file.
>>>>>
>>>>> In order to support higher vcpu-ids without generally increasing the
>>>>> memory consumption of guests on the host (some guest structures contain
>>>>> arrays sized by KVM_MAX_VCPU_IDS) add a boot parameter for adding some
>>>>> bits to the vcpu-id. Additional bits are needed as the vcpu-id is
>>>>> constructed via bit-wise concatenation of socket-id, core-id, etc.
>>>>> As those ids' maximum values are not always a power of 2, the vcpu-ids
>>>>> are sparse.
>>>>>
>>>>> The additional number of bits needed is basically the number of
>>>>> topology levels with a non-power-of-2 maximum value, excluding the
>>>>> topmost level.
>>>>>
>>>>> The default value of the new parameter will be 2 in order to support
>>>>> today's possible topologies. The special value of -1 will use the
>>>>> number of bits needed for a guest with the current host's topology.
>>>>>
>>>>> Calculating the maximum vcpu-id dynamically requires dynamically
>>>>> allocating the arrays that are sized by KVM_MAX_VCPU_IDS.
>>>>>
>>>>> Signed-off-by: Juergen Gross <jgross@...e.com>
>>>>
>>>> Just thought about vcpu-ids a little bit more.
>>>>
>>>> It would be possible to replace the topology games completely by an
>>>> arbitrary rather high vcpu-id limit (65536?) and to allocate the memory
>>>> depending on the max vcpu-id just as needed.
>>>>
>>>> Right now the only vcpu-id dependent memory is for the ioapic consisting
>>>> of a vcpu-id indexed bitmap and a vcpu-id indexed byte array (vectors).
>>>>
>>>> We could start with a minimal size when setting up an ioapic and extend
>>>> the areas in case a new vcpu created would introduce a vcpu-id outside
>>>> the currently allocated memory. Both arrays are protected by the ioapic
>>>> specific lock (at least I couldn't spot any unprotected usage when
>>>> looking briefly into the code), so reallocating those arrays shouldn't
>>>> be hard. In case of ENOMEM the related vcpu creation would just fail.
>>>>
>>>> Thoughts?
>>>
>>> Why not have userspace state the max vcpu_id it intends to create on a per-VM
>>> basis?  Same end result, but doesn't require the complexity of reallocating the
>>> I/O APIC stuff.
>>>
>>
>> And if the userspace doesn't do it (like today)?
> 
> Similar to my comments in patch 4, KVM's current limits could be used as the
> defaults, and any use case wanting to go beyond that would need an updated
> userspace.  Exceeding those limits today doesn't work, so there's no ABI breakage
> by requiring a userspace change.

Hmm, nice idea. Will look into it.

> Or again, this could be a Kconfig knob, though that feels a bit weird in this case.
> But it might make sense if it can be tied to something in the kernel's config?

Having a Kconfig knob for an absolute upper bound of vcpus should
be fine. If someone doesn't like the capability to explicitly let
qemu create very large VMs, he/she can still set that upper bound
to the normal KVM_MAX_VCPUS value.

Juergen
