linux-kernel - Re: [PATCH v2] KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALMp9eSyVWGS2HQVwwwViE6S_uweiOiFucqa3keuoUjNz9rKqA@mail.gmail.com>
Date:   Wed, 25 Jan 2023 16:58:40 -0800
From:   Jim Mattson <jmattson@...gle.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        seanjc@...gle.com, stable@...r.kernel.org
Subject: Re: [PATCH v2] KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID

On Wed, Jan 25, 2023 at 2:44 PM Paolo Bonzini <pbonzini@...hat.com> wrote:
>
> On 1/25/23 23:09, Jim Mattson wrote:
> > The topology leaves returned by KVM_GET_SUPPORTED_CPUID *for over a
> > decade* have been passed through unmodified from the host. They have
> > never made sense for KVM_SET_CPUID2, with the unlikely exception of a
> > whole-host VM.
>
> True, unfortunately people have not read the nonexistent documentation
> and they are:
>
> 1) rarely adjusting correctly all of 0xB, 0x1F and 0x8000001E;
>
> 2) never bounding CPUID[EAX=0].EAX to a known CPUID leaf, resulting for
> example in inconsistencies between 0xB and 0x1F.
>
> *But* (2) should not be needed unless you care about maintaining
> homogeneous CPUID within a VM migration pool.  For something like
> kvmtool, having to do (2) would be a workaround for the bug that this
> patch fixes.

Maybe we should just populate up to leaf 3. :-)

> > Our VMM populates the topology of the guest CPUID table on its own, as
> > any VMM must. However, it uses the host topology (which
> > KVM_GET_SUPPORTED_CPUID has been providing pass-through *for over a
> > decade*) to see if the requested guest topology is possible.
>
> Ok, thanks; this is useful to know.
>
> > Changing a long-established ABI in a way that breaks userspace
> > applications is a bad practice. I didn't think we, as a community, did
> > that. I didn't realize that we were only catering to open source
> > implementations here.
>
> We aren't.  But the open source implementations provide some guidance as
> to how the API is being used in the wild, and what the pitfalls are.
>
> You wrote it yourself: any VMM must either populate the topology on its
> own, or possibly fill it with zeros.  Returning a value that is
> extremely unlikely to be used is worse in pretty much every way (apart
> from not breaking your VMM, of course).

I've complained about this particular ioctl more than I can remember.
This is just one of its many problems.

> With a total of six known users (QEMU, crosvm, kvmtool, firecracker,
> rust-vmm, and the Google VMM), KVM is damned if it reverts the patch and
> damned if it doesn't.  There is a tension between fixing the one VMM
> that was using KVM_GET_SUPPORTED_CPUID correctly and now breaks loudly,
> and fixing 3-4 that were silently broken and are now fixed.  I will
> probably send a patch to crosvm, though.
>
> The VMM being _proprietary_ doesn't really matter, however it does
> matter to me that it is not _public_: it is only used within Google, and
> the breakage is neither hard to fix in the VMM nor hard to temporarily
> avoid by reverting the patch in the Google kernel.

Sadly, there isn't a single kernel involved. People running our VMM on
their desktops are going to be impacted as soon as this patch hits
that distro. (I don't know if I can say which distro that is.) So, now
we have to get the VMM folks to urgently accommodate this change and
get a new distribution out.