Message-Id: <20240826080618.3886694-1-maz@kernel.org>
Date: Mon, 26 Aug 2024 09:06:18 +0100
From: Marc Zyngier <maz@...nel.org>
To: linux-kernel@...r.kernel.org
Cc: Kunkun Jiang <jiangkunkun@...wei.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: [PATCH] genirq: Get rid of global lock in irq_do_set_affinity()
Kunkun Jiang reports that for a workload involving the simultaneous
startup of a large number of VMs (for a total of about 200 vcpus),
a lot of CPU time gets spent spinning on the tmp_mask_lock that
exists as a static raw spinlock in irq_do_set_affinity(). This lock
protects a global cpumask (tmp_mask) that is used as a temporary
variable to compute the resulting affinity.

While this is triggered by KVM issuing an irq_set_affinity() call
each time a vcpu is about to execute, it is obvious that having
a single global resource is not very scalable, and that we could
do something better.

Since a cpumask can be a fairly large structure on systems with
a high core count, a stack allocation is not really appropriate.
Instead, turn the global cpumask into a per-CPU variable, removing
the need for locking altogether as we are not preemptible at this
point.
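
As a rough userspace analogy of the idea above (not kernel code), the
sketch below replaces a lock-protected global scratch mask with one
scratch mask per thread. Here _Thread_local stands in for
DEFINE_PER_CPU()/this_cpu_ptr(), and thread-locality stands in for the
"not preemptible" guarantee. The names struct mask, MASK_WORDS,
mask_and() and effective_mask() are hypothetical and exist only for
this illustration; the selection logic mirrors only the final step of
irq_do_set_affinity() shown in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_WORDS 16 /* hypothetical size: 1024 CPUs at 64 bits per word */

struct mask {
	uint64_t bits[MASK_WORDS];
};

/*
 * One scratch mask per thread. No other thread can touch it, so no lock
 * is needed, and the (potentially large) mask does not live on the stack.
 */
static _Thread_local struct mask scratch;

/* dst = a & b; returns nonzero if the result has any bit set. */
static int mask_and(struct mask *dst, const struct mask *a,
		    const struct mask *b)
{
	uint64_t any = 0;

	for (size_t i = 0; i < MASK_WORDS; i++) {
		dst->bits[i] = a->bits[i] & b->bits[i];
		any |= dst->bits[i];
	}
	return any != 0;
}

/*
 * Pick the mask to program:
 *  - not forced and (requested & online) is non-empty: the intersection,
 *    returned as a pointer to this thread's scratch mask (valid until the
 *    next call on the same thread);
 *  - forced: the requested mask as given;
 *  - otherwise: NULL, standing in for -EINVAL.
 */
static const struct mask *effective_mask(const struct mask *requested,
					 const struct mask *online,
					 int force)
{
	struct mask *tmp = &scratch;
	int nonempty = mask_and(tmp, requested, online);

	if (!force && nonempty)
		return tmp;
	if (force)
		return requested;
	return NULL;
}
```

Because every caller works on its own scratch area, concurrent callers
never contend on shared state, which is the property the patch relies
on to drop tmp_mask_lock.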
Reported-by: Kunkun Jiang <jiangkunkun@...wei.com>
Suggested-by: Thomas Gleixner <tglx@...utronix.de>
Signed-off-by: Marc Zyngier <maz@...nel.org>
---
kernel/irq/manage.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index dd53298ef1a5..b6aa259ac749 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -224,15 +224,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
struct irq_desc *desc = irq_data_to_desc(data);
struct irq_chip *chip = irq_data_get_irq_chip(data);
const struct cpumask *prog_mask;
+ struct cpumask *tmp_mask;
int ret;
- static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
- static struct cpumask tmp_mask;
+ static DEFINE_PER_CPU(struct cpumask, __tmp_mask);
if (!chip || !chip->irq_set_affinity)
return -EINVAL;
- raw_spin_lock(&tmp_mask_lock);
+ tmp_mask = this_cpu_ptr(&__tmp_mask);
+
/*
* If this is a managed interrupt and housekeeping is enabled on
* it check whether the requested affinity mask intersects with
@@ -258,11 +259,11 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
- cpumask_and(&tmp_mask, mask, hk_mask);
- if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
+ cpumask_and(tmp_mask, mask, hk_mask);
+ if (!cpumask_intersects(tmp_mask, cpu_online_mask))
prog_mask = mask;
else
- prog_mask = &tmp_mask;
+ prog_mask = tmp_mask;
} else {
prog_mask = mask;
}
@@ -272,16 +273,14 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
* unless we are being asked to force the affinity (in which
* case we do as we are told).
*/
- cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
- if (!force && !cpumask_empty(&tmp_mask))
- ret = chip->irq_set_affinity(data, &tmp_mask, force);
+ cpumask_and(tmp_mask, prog_mask, cpu_online_mask);
+ if (!force && !cpumask_empty(tmp_mask))
+ ret = chip->irq_set_affinity(data, tmp_mask, force);
else if (force)
ret = chip->irq_set_affinity(data, mask, force);
else
ret = -EINVAL;
- raw_spin_unlock(&tmp_mask_lock);
-
switch (ret) {
case IRQ_SET_MASK_OK:
case IRQ_SET_MASK_OK_DONE:
--
2.39.2