[<prev] [next>] [day] [month] [year] [list]
Message-ID: <2026021435-CVE-2026-23198-8a25@gregkh>
Date: Sat, 14 Feb 2026 17:28:55 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-cve-announce@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...nel.org>
Subject: CVE-2026-23198: KVM: Don't clobber irqfd routing type when deassigning irqfd
From: Greg Kroah-Hartman <gregkh@...nel.org>
Description
===========
In the Linux kernel, the following vulnerability has been resolved:
KVM: Don't clobber irqfd routing type when deassigning irqfd
When deassigning a KVM_IRQFD, don't clobber the irqfd's copy of the IRQ's
routing entry as doing so breaks kvm_arch_irq_bypass_del_producer() on x86
and arm64, which explicitly look for KVM_IRQ_ROUTING_MSI. Instead, to
handle a concurrent routing update, verify that the irqfd is still active
before consuming the routing information. As evidenced by the x86 and
arm64 bugs, and another bug in kvm_arch_update_irqfd_routing() (see below),
clobbering the entry type without notifying arch code is surprising and
error prone.
As a bonus, checking that the irqfd is active provides a convenient
location for documenting _why_ KVM must not consume the routing entry for
an irqfd that is in the process of being deassigned: once the irqfd is
deleted from the list (which happens *before* the eventfd is detached), it
will no longer receive updates via kvm_irq_routing_update(), and so KVM
could deliver an event using stale routing information (relative to
KVM_SET_GSI_ROUTING returning to userspace).
As an even better bonus, explicitly checking for the irqfd being active
fixes a similar bug to the one the clobbering is trying to prevent: if an
irqfd is deactivated, and then its routing is changed,
kvm_irq_routing_update() won't invoke kvm_arch_update_irqfd_routing()
(because the irqfd isn't in the list). And so if the irqfd is in bypass
mode, IRQs will continue to be posted using the old routing information.
As for kvm_arch_irq_bypass_del_producer(), clobbering the routing type
results in KVM incorrectly keeping the IRQ in bypass mode, which is
especially problematic on AMD as KVM tracks IRQs that are being posted to
a vCPU in a list whose lifetime is tied to the irqfd.
Without the help of KASAN to detect use-after-free, the most common
sympton on AMD is a NULL pointer deref in amd_iommu_update_ga() due to
the memory for irqfd structure being re-allocated and zeroed, resulting
in irqfd->irq_bypass_data being NULL when read by
avic_update_iommu_vcpu_affinity():
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 40cf2b9067 P4D 40cf2b9067 PUD 408362a067 PMD 0
Oops: Oops: 0000 [#1] SMP
CPU: 6 UID: 0 PID: 40383 Comm: vfio_irq_test
Tainted: G U W O 6.19.0-smp--5dddc257e6b2-irqfd #31 NONE
Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.78.2-0 09/05/2025
RIP: 0010:amd_iommu_update_ga+0x19/0xe0
Call Trace:
<TASK>
avic_update_iommu_vcpu_affinity+0x3d/0x90 [kvm_amd]
__avic_vcpu_load+0xf4/0x130 [kvm_amd]
kvm_arch_vcpu_load+0x89/0x210 [kvm]
vcpu_load+0x30/0x40 [kvm]
kvm_arch_vcpu_ioctl_run+0x45/0x620 [kvm]
kvm_vcpu_ioctl+0x571/0x6a0 [kvm]
__se_sys_ioctl+0x6d/0xb0
do_syscall_64+0x6f/0x9d0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x46893b
</TASK>
---[ end trace 0000000000000000 ]---
If AVIC is inhibited when the irfd is deassigned, the bug will manifest as
list corruption, e.g. on the next irqfd assignment.
list_add corruption. next->prev should be prev (ffff8d474d5cd588),
but was 0000000000000000. (next=ffff8d8658f86530).
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:31!
Oops: invalid opcode: 0000 [#1] SMP
CPU: 128 UID: 0 PID: 80818 Comm: vfio_irq_test
Tainted: G U W O 6.19.0-smp--f19dc4d680ba-irqfd #28 NONE
Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.78.2-0 09/05/2025
RIP: 0010:__list_add_valid_or_report+0x97/0xc0
Call Trace:
<TASK>
avic_pi_update_irte+0x28e/0x2b0 [kvm_amd]
kvm_pi_update_irte+0xbf/0x190 [kvm]
kvm_arch_irq_bypass_add_producer+0x72/0x90 [kvm]
irq_bypass_register_consumer+0xcd/0x170 [irqbypass]
kvm_irqfd+0x4c6/0x540 [kvm]
kvm_vm_ioctl+0x118/0x5d0 [kvm]
__se_sys_ioctl+0x6d/0xb0
do_syscall_64+0x6f/0x9d0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
---[ end trace 0000000000000000 ]---
On Intel and arm64, the bug is less noisy, as the end result is that the
device keeps posting IRQs to the vCPU even after it's been deassigned.
Note, the worst of the breakage can be traced back to commit cb210737675e
("KVM: Pass new routing entries and irqfd when updating IRTEs"), as before
that commit KVM would pull the routing information from the per-VM routing
table. But as above, similar bugs have existed since support for IRQ
bypass was added. E.g. if a routing change finished before irq_shutdown()
invoked kvm_arch_irq_bypass_del_producer(), VMX and SVM would see stale
routing information and potentially leave the irqfd in bypass mode.
Alternatively, x86 could be fixed by explicitly checking irq_bypass_vcpu
instead of irq_entry.type in kvm_arch_irq_bypass_del_producer(), and arm64
could be modified to utilize irq_bypass_vcpu in a similar manner. But (a)
that wouldn't fix the routing updates bug, and (b) fixing core code doesn't
preclude x86 (or arm64) from adding such code as a sanity check (spoiler
alert).
The Linux kernel CVE team has assigned CVE-2026-23198 to this issue.
Affected and fixed versions
===========================
Issue introduced in 4.4 with commit f70c20aaf141adb715a2d750c55154073b02a9c3 and fixed in 5.10.250 with commit 959a063e7f12524bc1871ad1f519787967bbcd45
Issue introduced in 4.4 with commit f70c20aaf141adb715a2d750c55154073b02a9c3 and fixed in 5.15.200 with commit 2284bc168b148a17b5ca3b37b3d95c411f18a08d
Issue introduced in 4.4 with commit f70c20aaf141adb715a2d750c55154073b02a9c3 and fixed in 6.1.163 with commit 6d14ba1e144e796b5fc81044f08cfba9024ca195
Issue introduced in 4.4 with commit f70c20aaf141adb715a2d750c55154073b02a9c3 and fixed in 6.6.124 with commit b61f9b2fcf181451d0a319889478cc53c001123e
Issue introduced in 4.4 with commit f70c20aaf141adb715a2d750c55154073b02a9c3 and fixed in 6.12.70 with commit ff48c9312d042bfbe826ca675e98acc6c623211c
Issue introduced in 4.4 with commit f70c20aaf141adb715a2d750c55154073b02a9c3 and fixed in 6.18.10 with commit 4385b2f2843549bfb932e0dcf76bf4b065543a3c
Issue introduced in 4.4 with commit f70c20aaf141adb715a2d750c55154073b02a9c3 and fixed in 6.19 with commit b4d37cdb77a0015f51fee083598fa227cc07aaf1
Please see https://www.kernel.org for a full list of currently supported
kernel versions by the kernel community.
Unaffected versions might change over time as fixes are backported to
older supported kernel versions. The official CVE entry at
https://cve.org/CVERecord/?id=CVE-2026-23198
will be updated if fixes are backported, please check that for the most
up to date information about this issue.
Affected files
==============
The file(s) affected by this issue are:
virt/kvm/eventfd.c
Mitigation
==========
The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes. Individual
changes are never tested alone, but rather are part of a larger kernel
release. Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all. If however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
https://git.kernel.org/stable/c/959a063e7f12524bc1871ad1f519787967bbcd45
https://git.kernel.org/stable/c/2284bc168b148a17b5ca3b37b3d95c411f18a08d
https://git.kernel.org/stable/c/6d14ba1e144e796b5fc81044f08cfba9024ca195
https://git.kernel.org/stable/c/b61f9b2fcf181451d0a319889478cc53c001123e
https://git.kernel.org/stable/c/ff48c9312d042bfbe826ca675e98acc6c623211c
https://git.kernel.org/stable/c/4385b2f2843549bfb932e0dcf76bf4b065543a3c
https://git.kernel.org/stable/c/b4d37cdb77a0015f51fee083598fa227cc07aaf1
Powered by blists - more mailing lists