lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5f6d99bc28fde0c48907991b6f67009430aea243.camel@redhat.com>
Date:   Wed, 31 Aug 2022 16:37:02 +0300
From:   Maxim Levitsky <mlevitsk@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Cc:     kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
        Li RongQing <lirongqing@...du.com>
Subject: Re: [PATCH 14/19] KVM: x86: Honor architectural behavior for
 aliased 8-bit APIC IDs

On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> Apply KVM's hotplug hack if and only if userspace has enabled 32-bit IDs
> for x2APIC.  If 32-bit IDs are not enabled, disable the optimized map to
> honor x86 architectural behavior if multiple vCPUs shared a physical APIC
> ID.  As called out in the changelog that added the hack, all CPUs whose
> (possibly truncated) APIC ID matches the target are supposed to receive
> the IPI.
> 
>   KVM intentionally differs from real hardware, because real hardware
>   (Knights Landing) does just "x2apic_id & 0xff" to decide whether to
>   accept the interrupt in xAPIC mode and it can deliver one interrupt to
>   more than one physical destination, e.g. 0x123 to 0x123 and 0x23.
> 
> Applying the hack even when x2APIC is not fully enabled means KVM doesn't
> correctly handle scenarios where the guest has aliased xAPIC IDs across
> multiple vCPUs, as only the vCPU with the lowest vCPU ID will receive any
> interrupts.  It's extremely unlikely any real world guest aliase APIC IDs,
> or even modifies APIC IDs, but KVM's behavior is arbitrary, e.g. the
> lowest vCPU ID "wins" regardless of which vCPU is "aliasing" and which
> vCPU is "normal".
> 
> Furthermore, the hack is _not_ guaranteed to work!  The hack works if and
> only if the optimized APIC map is successfully allocated.  If the map
> allocation fails (unlikely), KVM will fall back to its unoptimized
> behavior, which _does_ honor the architectural behavior.
> 
> Pivot on 32-bit x2APIC IDs being enabled as that is required to take
> advantage of the hotplug hack (see kvm_apic_state_fixup()), i.e. won't
> break existing setups unless they are way, way off in the weeds.
> 
> And an entry in KVM's errata to document the hack.  Alternatively, KVM
> could provide an actual x2APIC quirk and document the hack that way, but
> there's unlikely to ever be a use case for disabling the quirk.  Go the
> errata route to avoid having to validate a quirk no one cares about.
> 
> Fixes: 5bd5db385b3e ("KVM: x86: allow hotplug of VCPU with APIC ID over 0xff")
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>  Documentation/virt/kvm/x86/errata.rst | 11 ++++++
>  arch/x86/kvm/lapic.c                  | 50 ++++++++++++++++++++++-----
>  2 files changed, 52 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/x86/errata.rst b/Documentation/virt/kvm/x86/errata.rst
> index 410e0aa63493..49a05f24747b 100644
> --- a/Documentation/virt/kvm/x86/errata.rst
> +++ b/Documentation/virt/kvm/x86/errata.rst
> @@ -37,3 +37,14 @@ Nested virtualization features
>  ------------------------------
>  
>  TBD
> +
> +x2APIC
> +------
> +When KVM_X2APIC_API_USE_32BIT_IDS is enabled, KVM activates a hack/quirk that
> +allows sending events to a single vCPU using its x2APIC ID even if the target
> +vCPU has legacy xAPIC enabled, e.g. to bring up hotplugged vCPUs via INIT-SIPI
> +on VMs with > 255 vCPUs.  A side effect of the quirk is that, if multiple vCPUs
> +have the same physical APIC ID, KVM will deliver events targeting that APIC ID
> +only to the vCPU with the lowest vCPU ID.  If KVM_X2APIC_API_USE_32BIT_IDS is
> +not enabled, KVM follows x86 architecture when processing interrupts (all vCPUs
> +matching the target APIC ID receive the interrupt).
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index d537b51295d6..c224b5c7cd92 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -260,10 +260,10 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>  		struct kvm_lapic *apic = vcpu->arch.apic;
>  		struct kvm_lapic **cluster;
> +		u32 x2apic_id, physical_id;
>  		u16 mask;
>  		u32 ldr;
>  		u8 xapic_id;
> -		u32 x2apic_id;
>  
>  		if (!kvm_apic_present(vcpu))
>  			continue;
> @@ -271,16 +271,48 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  		xapic_id = kvm_xapic_id(apic);
>  		x2apic_id = kvm_x2apic_id(apic);
>  
> -		/* Hotplug hack: see kvm_apic_match_physical_addr(), ... */
> -		if ((apic_x2apic_mode(apic) || x2apic_id > 0xff) &&
> -				x2apic_id <= new->max_apic_id)
> -			new->phys_map[x2apic_id] = apic;
>  		/*
> -		 * ... xAPIC ID of VCPUs with APIC ID > 0xff will wrap-around,
> -		 * prevent them from masking VCPUs with APIC ID <= 0xff.
> +		 * Apply KVM's hotplug hack if userspace has enable 32-bit APIC
> +		 * IDs.  Allow sending events to vCPUs by their x2APIC ID even
> +		 * if the target vCPU is in legacy xAPIC mode, and silently
> +		 * ignore aliased xAPIC IDs (the x2APIC ID is truncated to 8
> +		 * bits, causing IDs > 0xff to wrap and collide).
> +		 *
> +		 * Honor the architectural (and KVM's non-optimized) behavior
> +		 * if userspace has not enabled 32-bit x2APIC IDs.  Each APIC
> +		 * is supposed to process messages independently.  If multiple
> +		 * vCPUs have the same effective APIC ID, e.g. due to the
> +		 * x2APIC wrap or because the guest manually modified its xAPIC
> +		 * IDs, events targeting that ID are supposed to be recognized
> +		 * by all vCPUs with said ID.
>  		 */
> -		if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])
> -			new->phys_map[xapic_id] = apic;
> +		if (kvm->arch.x2apic_format) {
> +			/* See also kvm_apic_match_physical_addr(). */
> +			if ((apic_x2apic_mode(apic) || x2apic_id > 0xff) &&
> +			    x2apic_id <= new->max_apic_id)
> +				new->phys_map[x2apic_id] = apic;
> +
> +			if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])
> +				new->phys_map[xapic_id] = apic;
> +		} else {
> +			/*
> +			 * Disable the optimized map if the physical APIC ID is
> +			 * already mapped, i.e. is aliased to multiple vCPUs.
> +			 * The optimized map requires a strict 1:1 mapping
> +			 * between IDs and vCPUs.
> +			 */
> +			if (apic_x2apic_mode(apic))
> +				physical_id = x2apic_id;
> +			else
> +				physical_id = xapic_id;
> +
> +			if (new->phys_map[physical_id]) {
> +				kvfree(new);
> +				new = NULL;
> +				goto out;
Why not to use the same  KVM_APIC_MODE_XAPIC_FLAT |  KVM_APIC_MODE_XAPIC_CLUSTER
hack here?

Other than that looks correct but I don't know this area well enough to be 100% sure.

Best regards,
	Maxim Levitsky


> +			}
> +			new->phys_map[physical_id] = apic;
> +		}
>  
>  		if (!kvm_apic_sw_enabled(apic))
>  			continue;


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ