lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <05cf3e20-6508-48c3-9e4c-9f2a2a719012@redhat.com>
Date: Tue, 10 Sep 2024 19:52:41 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Rick Edgecombe <rick.p.edgecombe@...el.com>, seanjc@...gle.com,
 kvm@...r.kernel.org
Cc: kai.huang@...el.com, isaku.yamahata@...il.com,
 tony.lindgren@...ux.intel.com, xiaoyao.li@...el.com,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 25/25] KVM: x86: Add CPUID bits missing from
 KVM_GET_SUPPORTED_CPUID

On 8/13/24 00:48, Rick Edgecombe wrote:
> Originally, the plan was to filter the directly configurable CPUID bits
> exposed by KVM_TDX_CAPABILITIES, and the final configured bit values
> provided by KVM_TDX_GET_CPUID. However, several issues were found with
> this. Both the filtering done with KVM_TDX_CAPABILITIES and
> KVM_TDX_GET_CPUID had the issue that the get_supported_cpuid() provided
> default values instead of supported masks for multi-bit fields (i.e. those
> encoding a multi-bit number).
> 
> For KVM_TDX_CAPABILITIES, there was also the problem of bits that are
> actually supported by KVM, but missing from get_supported_cpuid() for one
> reason or another. These include X86_FEATURE_MWAIT, X86_FEATURE_HT and
> X86_FEATURE_TSC_DEADLINE_TIMER. This is currently worked around in QEMU by
> adjusting which features are expected. Some of these are going to be added
> to get_supported_cpuid(), and that is probably the right long term fix.

There are several cases here:

- MWAIT is hidden because it's hard to virtualize its C-state parameters

- HT is hidden because it depends on the topology, and cannot be added 
blindly.

- TSC_DEADLINE_TIMER is queried with KVM_CHECK_EXTENSION for historical 
reasons

There are basically two kinds of userspace:

- those that fetch KVM_GET_SUPPORED_CPUID and pass it blindly to 
KVM_SET_CPUID2.  These mostly work, though they may miss a feature or 
three (e.g. the TSC deadline timer).

- those that know each bit and make an informed decision on what to 
enable; for those, KVM_GET_SUPPORTED_CPUID is just guidance.

Because of this, KVM_GET_SUPPORTED_CPUID doesn't return bits that are 
one; it returns a mix of:

- maximum supported values (e.g. CPUID[7,0].EAX)

- values from the host (e.g. FMS or model name)

- supported features

It's an awfully defined API but it is easier to use than it sounds (some 
of the quirks are being documented in 
Documentation/virt/kvm/x86/errata.rst and 
Documentation/virt/kvm/x86/api.rst).  The idea is that, if userspace 
manages individual CPUID bits, it already knows what can be one anyway.

This is the kind of API that we need to present for TDX, even if the 
details on how to get the supported CPUID are different.  Not because 
it's a great API, but rather because it's a known API.

The difference between this and KVM_GET_SUPPORTED_CPUID are small, but 
the main one is X86_FEATURE_HYPERVISOR (I am not sure whether to make it 
different with respect to X86_FEATURE_TSC_DEADLINE_TIMER; leaning 
towards no).

We may also need a second ioctl specifically to return the fixed-1 bits. 
  Asking Xiaoyao for input with regard to what he'd like to have in QEMU.

> +	entry = kvm_find_cpuid_entry2((*cpuid)->entries, (*cpuid)->nent, 0x0, 0);
> +	if (WARN_ON(!entry))
> +		goto err;
> +	/* Fixup of maximum basic leaf */
> +	entry->eax |= 0x000000FF;
> +
> +	entry = kvm_find_cpuid_entry2((*cpuid)->entries, (*cpuid)->nent, 0x1, 0);
> +	if (WARN_ON(!entry))
> +		goto err;
> +	/* Fixup of FMS */
> +	entry->eax |= 0x0fff3fff;
> +	/* Fixup of maximum logical processors per package */
> +	entry->ebx |= 0x00ff0000;
> +

I see now why you could blindly AND things in patch 24.

However, the right mode of operation is still to pick manually which 
bits to AND.

Paolo


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ