lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <42513cb3-3c2e-4aa8-b748-23b6656a5096@redhat.com>
Date: Mon, 22 Dec 2025 15:09:13 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Ankit Soni <Ankit.Soni@....com>, Sean Christopherson <seanjc@...gle.com>,
 Marc Zyngier <maz@...nel.org>
Cc: Oliver Upton <oliver.upton@...ux.dev>, Joerg Roedel <joro@...tes.org>,
 David Woodhouse <dwmw2@...radead.org>, Lu Baolu <baolu.lu@...ux.intel.com>,
 linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev,
 kvm@...r.kernel.org, iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
 Sairaj Kodilkar <sarunkod@....com>, Vasant Hegde <vasant.hegde@....com>,
 Maxim Levitsky <mlevitsk@...hat.com>,
 Joao Martins <joao.m.martins@...cle.com>,
 Francesco Lavra <francescolavra.fl@...il.com>,
 David Matlack <dmatlack@...gle.com>, Naveen Rao <Naveen.Rao@....com>
Subject: possible deadlock due to irq_set_thread_affinity() calling into the
 scheduler (was Re: [PATCH v3 38/62] KVM: SVM: Take and hold ir_list_lock
 across IRTE updates in IOMMU)

On 12/22/25 10:16, Ankit Soni wrote:
>    ======================================================
>    WARNING: possible circular locking dependency detected
>    6.19.0-rc2 #20 Tainted: G            E
>    ------------------------------------------------------
>    CPU 58/KVM/28597 is trying to acquire lock:
>      ff12c47d4b1f34c0 (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0x58/0xa0
> 
>      but task is already holding lock:
>      ff12c49b28552110 (&svm->ir_list_lock){....}-{2:2}, at: avic_pi_update_irte+0x147/0x270 [kvm_amd]
> 
>      which lock already depends on the new lock.
> 
>    Chain exists of:
>      &irq_desc_lock_class --> &rq->__lock --> &svm->ir_list_lock
> 
>    Possible unsafe locking scenario:
> 
>          CPU0                            CPU1
>          ----                            ----
>     lock(&svm->ir_list_lock);
>                                        lock(&rq->__lock);
>                                        lock(&svm->ir_list_lock);
>     lock(&irq_desc_lock_class);
> 
>          *** DEADLOCK ***
> 
> So lockdep sees:
> 
>    &irq_desc_lock_class -> &rq->__lock -> &svm->ir_list_lock
> 
> while avic_pi_update_irte() currently holds svm->ir_list_lock and then
> takes irq_desc_lock via irq_set_vcpu_affinity(), which creates the
> potential inversion.
> 
>    - Is this lockdep warning expected/benign in this code path, or does it
>      indicate a real potential deadlock between svm->ir_list_lock and
>      irq_desc_lock with AVIC + irq_bypass + VFIO?

I'd treat it as a potential (if unlikely) deadlock:

(a) irq_set_thread_affinity triggers the scheduler via wake_up_process,
while irq_desc->lock is taken

(b) the scheduler calls into KVM with rq_lock taken, and KVM uses
ir_list_lock within __avic_vcpu_load/__avic_vcpu_put

(c) KVM wants to block scheduling for a while and uses ir_list_lock for
that purpose, but then takes irq_set_vcpu_affinity takes irq_desc->lock.

I don't think there's an alternative choice of lock for (c); and there's
no easy way to pull the irq_desc->lock out of the IRQ subsystem--in fact
the stickiness of the situation comes from rq->rq_lock and
irq_desc->lock being both internal and not leaf.

Of the three, the most sketchy is (a); notably, __setup_irq() calls
wake_up_process outside desc->lock.  Therefore I'd like so much to treat
it as a kernel/irq/ bug; and the simplest (perhaps too simple...) fix is
to drop the wake_up_process().  The only cost is extra latency on the
next interrupt after an affinity change.

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 8b1b4c8a4f54..fc135bd079a4 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -189,14 +189,10 @@ static void irq_set_thread_affinity(struct irq_desc *desc)
  	struct irqaction *action;
  
  	for_each_action_of_desc(desc, action) {
-		if (action->thread) {
+		if (action->thread)
  			set_bit(IRQTF_AFFINITY, &action->thread_flags);
-			wake_up_process(action->thread);
-		}
-		if (action->secondary && action->secondary->thread) {
+		if (action->secondary && action->secondary->thread)
  			set_bit(IRQTF_AFFINITY, &action->secondary->thread_flags);
-			wake_up_process(action->secondary->thread);
-		}
  	}
  }
  
Marc, what do you think?

Paolo


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ