Message-ID: <79be8c4ffb506bbf9fdf3f69ac8f24edacbeaf35.camel@intel.com>
Date: Wed, 28 Jan 2026 21:10:06 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>
CC: "dwmw2@...radead.org" <dwmw2@...radead.org>, "khushit.shah@...anix.com"
	<khushit.shah@...anix.com>, "bp@...en8.de" <bp@...en8.de>, "x86@...nel.org"
	<x86@...nel.org>, "tglx@...utronix.de" <tglx@...utronix.de>, "hpa@...or.com"
	<hpa@...or.com>, "Kohler, Jon" <jon@...anix.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
	"mingo@...hat.com" <mingo@...hat.com>, "pbonzini@...hat.com"
	<pbonzini@...hat.com>, "stable@...r.kernel.org" <stable@...r.kernel.org>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "shaju.abraham@...anix.com"
	<shaju.abraham@...anix.com>
Subject: Re: [PATCH v6] KVM: x86: Add x2APIC "features" to control EOI
 broadcast suppression

On Wed, 2026-01-28 at 06:57 -0800, Sean Christopherson wrote:
> On Wed, Jan 28, 2026, Kai Huang wrote:
> > On Tue, 2026-01-27 at 19:48 -0800, David Woodhouse wrote:
> > > On Wed, 2026-01-28 at 02:22 +0000, Huang, Kai wrote:
> > > >  
> > > > > Ah, so userspace which checks all the kernel's capabilities *first*
> > > > > will not see KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST advertised,
> > > > > because it needs to enable KVM_CAP_SPLIT_IRQCHIP first?
> > > > > 
> > > > > I guess that's tolerable¹ but the documentation could make it clearer,
> > > > > perhaps? I can see VMMs silently failing to detect the feature because
> > > > > they just don't set split-irqchip before checking for it? 
> > > > > 
> > > > > 
> > > > > ¹ although I still kind of hate it and would have preferred to have the
> > > > >    I/O APIC patch; userspace still has to intentionally *enable* that
> > > > >    combination. But OK, I've reluctantly conceded that.
> > > > 
> > > > To make it even more robust, perhaps we can grab kvm->lock mutex in
> > > > kvm_vm_ioctl_enable_cap() for KVM_CAP_X2APIC_API, so that it won't race with
> > > > KVM_CREATE_IRQCHIP (which already grabs kvm->lock) and
> > > > KVM_CAP_SPLIT_IRQCHIP?
> > > > 
> > > > Even more, we can add an additional check in KVM_CREATE_IRQCHIP to return
> > > > -EINVAL when it sees kvm->arch.suppress_eoi_broadcast_mode is
> > > > KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST?
> > > 
> > > If we do that, then the query for KVM_CAP_X2APIC_API could advertise
> > > the KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST for a freshly created KVM,
> > > even before userspace has enabled *either* KVM_CREATE_IRQCHIP or
> > > KVM_CAP_SPLIT_IRQCHIP?
> > 
> > No IIUC it doesn't change that?
> > 
> > The change I mentioned above is only related to "enable" part, but not
> > "query" part.
> > 
> > The "query" is done via kvm_vm_ioctl_check_extension(KVM_CAP_X2APIC_API),
> > and in this patch, it does:
> > 
> > @@ -4931,6 +4933,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >  		break;
> >  	case KVM_CAP_X2APIC_API:
> >  		r = KVM_X2APIC_API_VALID_FLAGS;
> > +		if (kvm && !irqchip_split(kvm))
> > +			r &= ~KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST;
> > 
> > IIRC if this is called before KVM_CREATE_IRQCHIP and KVM_CAP_SPLIT_IRQCHIP,
> > then !irqchip_split() will be true, so it will NOT advertise
> > KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST.
> > 
> > If it is called after KVM_CAP_SPLIT_IRQCHIP, then it will advertise
> > KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST.
> 
> Yep.  And when called at system-scope, i.e. with @kvm=NULL, userspace will see
> the maximal support with KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST.

Yep.
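
To spell out the three cases (system scope, fresh VM, VM with split irqchip),
here is a minimal userspace model of the query logic in the hunk above.  It is
only a sketch, not the kernel code: struct sk_vm, the SK_ names and the bit
values are placeholders made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit values for this sketch only, not the real uAPI values. */
#define SK_X2APIC_API_OTHER_FLAGS		0x3u
#define SK_X2APIC_ENABLE_SUPPRESS_EOI_BCAST	0x4u
#define SK_X2APIC_API_VALID_FLAGS \
	(SK_X2APIC_API_OTHER_FLAGS | SK_X2APIC_ENABLE_SUPPRESS_EOI_BCAST)

/* Hypothetical stand-in for struct kvm; it only tracks the irqchip mode. */
struct sk_vm {
	bool split_irqchip;
};

/*
 * Models the KVM_CAP_X2APIC_API query.  vm == NULL stands for the
 * system-scope query, which reports the maximal set of flags.
 */
static unsigned int sk_check_x2apic_api(const struct sk_vm *vm)
{
	unsigned int r = SK_X2APIC_API_VALID_FLAGS;

	/* VM scope without a split irqchip: hide the ENABLE flag. */
	if (vm && !vm->split_irqchip)
		r &= ~SK_X2APIC_ENABLE_SUPPRESS_EOI_BCAST;

	return r;
}

/* Walks through the three cases discussed in the thread. */
static void sk_query_selfcheck(void)
{
	struct sk_vm fresh = { .split_irqchip = false };
	struct sk_vm split = { .split_irqchip = true };

	/* System scope: flag is advertised. */
	assert(sk_check_x2apic_api(NULL) & SK_X2APIC_ENABLE_SUPPRESS_EOI_BCAST);
	/* Fresh VM, no irqchip configured yet: flag is hidden. */
	assert(!(sk_check_x2apic_api(&fresh) & SK_X2APIC_ENABLE_SUPPRESS_EOI_BCAST));
	/* VM after enabling the split irqchip: flag is advertised. */
	assert(sk_check_x2apic_api(&split) & SK_X2APIC_ENABLE_SUPPRESS_EOI_BCAST);
}
```

So a VMM that probes at VM scope has to enable the split irqchip before the
probe, or probe at system scope instead.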

> 
> > Btw, it doesn't grab kvm->lock either, so theoretically it could race with
> > KVM_CREATE_IRQCHIP and kvm_vm_ioctl_enable_cap(KVM_CAP_SPLIT_IRQCHIP) too.
> 
> That's totally fine.
> 
> > > That would be slightly better than the existing proposed awfulness
> > > where the kernel doesn't *admit* to having the _ENABLE_ capability
> > > until userspace first enables the KVM_CAP_SPLIT_IRQCHIP.
> > 
> > We could also make kvm_vm_ioctl_check_extension(KVM_CAP_X2APIC_API)
> > _always_ advertise KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST if that's
> > better.
> 
> No, because then we'd need new uAPI if we add support for ENABLE_SUPPRESS_EOI_BROADCAST
> with an in-kernel I/O APIC.

That's my concern too (wasn't quite sure about that, though).

I thought we could document that the in-kernel I/O APIC doesn't work with
ENABLE_SUPPRESS_EOI_BROADCAST for now, but that we may support it in the future.

> 
> > I suppose what we need is to document such behaviour -- that although
> > KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST is advertised as supported, it
> > cannot be enabled together with KVM_CREATE_IRQCHIP -- one will fail
> > depending on which is called first.
> 
> No, we don't need to explicitly document this, because it's super duper basic
> multi-threaded programming.  KVM only needs to document that
> KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST requires a VM with KVM_CAP_SPLIT_IRQCHIP.
> 
> > As a bonus, it can get rid of the "calling irqchip_split() w/o holding
> > kvm->lock" awfulness too.
> 
> No, it's not awfulness.  It's userspace's responsibility to not be stupid.  KVM
> taking kvm->lock changes *nothing*.  
> 

Right, it doesn't change the result.

> All holding kvm->lock does is serialize KVM
> code, it doesn't prevent a race.  I.e. it just changes whether tasks are racing
> to acquire kvm->lock versus racing against irqchip_mode.
> 
> If userspace invokes KVM_CAP_SPLIT_IRQCHIP and KVM_ENABLE_CAP concurrently on two
> separate tasks, then KVM_ENABLE_CAP will fail ~50% of the time regardless of
> whether or not KVM takes kvm->lock.
> 

Fair enough.  Thanks for the clarification :-)
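
For the record, a small model of why the lock doesn't help.  A mutex only
guarantees that the two calls run in one of two serialized orders; it does not
pick which one.  The sketch below just enumerates both orders.  It assumes the
enable path rejects the request when the VM has no split irqchip; the rc_
names, struct rc_vm and the -EINVAL return value are hypothetical and only
meant for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-VM state relevant to this discussion. */
struct rc_vm {
	bool split_irqchip;
	bool suppress_eoi_broadcast;
};

/* Models enabling KVM_CAP_SPLIT_IRQCHIP (always succeeds in this model). */
static int rc_enable_split_irqchip(struct rc_vm *vm)
{
	vm->split_irqchip = true;
	return 0;
}

/*
 * Models enabling suppress EOI broadcast.  Assumed to fail unless the split
 * irqchip is already enabled; the error code is an assumption.
 */
static int rc_enable_suppress_eoi(struct rc_vm *vm)
{
	if (!vm->split_irqchip)
		return -EINVAL;

	vm->suppress_eoi_broadcast = true;
	return 0;
}

/*
 * Runs both operations on a fresh VM in the given serialized order and
 * returns the result of the suppress-EOI enable.
 */
static int rc_enable_after(bool split_first)
{
	struct rc_vm vm = { false, false };
	int r;

	if (split_first) {
		rc_enable_split_irqchip(&vm);
		r = rc_enable_suppress_eoi(&vm);
	} else {
		r = rc_enable_suppress_eoi(&vm);
		rc_enable_split_irqchip(&vm);
	}

	return r;
}
```

With or without a lock, two concurrent tasks land in one of these two orders,
so the only reliable fix is for userspace to sequence the calls itself.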
