Message-ID: <20260107215353.75612-1-longman@redhat.com>
Date: Wed, 7 Jan 2026 16:53:53 -0500
From: Waiman Long <longman@...hat.com>
To: Marc Zyngier <maz@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>
Cc: linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org,
linux-rt-devel@...ts.linux.dev,
Waiman Long <longman@...hat.com>
Subject: [PATCH] irqchip/gic-v3-its: Don't acquire rt_spin_lock in allocate_vpe_l1_table()

When running a PREEMPT_RT debug kernel on a 2-socket Grace arm64 system,
the following bug report is produced at boot time:

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/72
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
:
CPU: 72 UID: 0 PID: 0 Comm: swapper/72 Tainted: G W 6.19.0-rc4-test+ #4 PREEMPT_{RT,(full)}
Tainted: [W]=WARN
Call trace:
:
rt_spin_lock+0xe4/0x408
rmqueue_bulk+0x48/0x1de8
__rmqueue_pcplist+0x410/0x650
rmqueue.constprop.0+0x6a8/0x2b50
get_page_from_freelist+0x3c0/0xe68
__alloc_frozen_pages_noprof+0x1dc/0x348
alloc_pages_mpol+0xe4/0x2f8
alloc_frozen_pages_noprof+0x124/0x190
allocate_slab+0x2f0/0x438
new_slab+0x4c/0x80
___slab_alloc+0x410/0x798
__slab_alloc.constprop.0+0x88/0x1e0
__kmalloc_cache_noprof+0x2dc/0x4b0
allocate_vpe_l1_table+0x114/0x788
its_cpu_init_lpis+0x344/0x790
its_cpu_init+0x60/0x220
gic_starting_cpu+0x64/0xe8
cpuhp_invoke_callback+0x438/0x6d8
__cpuhp_invoke_callback_range+0xd8/0x1f8
notify_cpu_starting+0x11c/0x178
secondary_start_kernel+0xc8/0x188
__secondary_switched+0xc0/0xc8

This happens because allocate_vpe_l1_table() calls kzalloc() to
allocate a cpumask_t when the first CPU of the second node of the
72-cpu Grace system is brought up. The call is made from the
CPUHP_AP_IRQ_GIC_STARTING state (gic_starting_cpu() in the trace above),
inside the starting section of the CPU hotplug bringup pipeline, where
interrupts are disabled. This is an atomic context where sleeping is not
allowed, and acquiring a sleeping rt_spin_lock within kzalloc() may lead
to a system hang if the lock is contended.

To work around this issue, introduce a vpe_alloc_cpumask() helper that
hands out cpumasks from a static buffer when running a PREEMPT_RT
kernel. The static buffer is currently 4 kbytes in size (512 unsigned
longs on a 64-bit kernel). As only one cpumask is needed per node, this
should be big enough as long as (cpumask_size() * nr_node_ids) does not
exceed 4k. If the buffer is ever exhausted, the helper falls back to
kzalloc().

Signed-off-by: Waiman Long <longman@...hat.com>
---
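The sizing rule in the commit message can be sanity-checked with a small
userspace sketch of the same bump-allocation scheme. This is only an
illustration under stated assumptions, not the kernel code: the names
sketch_*, SKETCH_NR_CPUS and SKETCH_BUF_LONGS are hypothetical, the CPU
count of 144 is an assumed value for a 2-socket system, C11
atomic_fetch_add() stands in for the kernel's atomic_fetch_inc(), and the
sketch returns NULL on exhaustion where the patch falls back to kzalloc().

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel values; not taken from the patch. */
#define SKETCH_NR_CPUS   144	/* assumed: 2 sockets x 72 CPUs */
#define SKETCH_BUF_LONGS 512	/* same element count as mask_buf[] */

#define SKETCH_BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

/* Static storage is zero-initialized, so every slot starts out cleared. */
static unsigned long sketch_buf[SKETCH_BUF_LONGS];
static atomic_int sketch_idx;

/* Bytes needed for one CPU bitmap, rounded up to whole longs. */
static size_t sketch_cpumask_size(void)
{
	size_t longs = (SKETCH_NR_CPUS + SKETCH_BITS_PER_LONG - 1) /
		       SKETCH_BITS_PER_LONG;

	return longs * sizeof(unsigned long);
}

/* Number of whole masks that fit in the static buffer. */
static size_t sketch_capacity(void)
{
	return sizeof(sketch_buf) / sketch_cpumask_size();
}

/*
 * Claim the next free slot without taking any lock. Returns NULL once
 * the buffer is exhausted (the patch calls kzalloc() at that point).
 */
static unsigned long *sketch_alloc_mask(void)
{
	size_t mask_size = sketch_cpumask_size();
	int idx = atomic_fetch_add(&sketch_idx, 1);

	assert(idx >= 0);
	if ((size_t)idx >= sketch_capacity())
		return NULL;
	return &sketch_buf[(size_t)idx * mask_size / sizeof(unsigned long)];
}
```

With these assumed values and a 64-bit unsigned long, one mask takes 3
longs, so consecutive slots are 3 elements apart and the buffer holds far
more masks than a 2-node system needs. The fetch-and-add makes each index
unique, so no lock is required even if several CPUs allocate concurrently.
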
drivers/irqchip/irq-gic-v3-its.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index ada585bfa451..9185785524dc 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -2896,6 +2896,30 @@ static bool allocate_vpe_l2_table(int cpu, u32 id)
 	return true;
 }
 
+static void *vpe_alloc_cpumask(void)
+{
+	/*
+	 * With PREEMPT_RT kernel, we can't call any k*alloc() APIs as they
+	 * may acquire a sleeping rt_spin_lock in an atomic context. So use
+	 * a pre-allocated buffer instead.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+		static unsigned long mask_buf[512];
+		static atomic_t alloc_idx;
+		int idx, mask_size = cpumask_size();
+		int nr_cpumasks = sizeof(mask_buf)/mask_size;
+
+		/*
+		 * Fetch an allocation index and if it points to a buffer within
+		 * mask_buf[], return that. Fall back to kzalloc() otherwise.
+		 */
+		idx = atomic_fetch_inc(&alloc_idx);
+		if (idx < nr_cpumasks)
+			return &mask_buf[idx * mask_size/sizeof(long)];
+	}
+	return kzalloc(sizeof(cpumask_t), GFP_ATOMIC);
+}
+
 static int allocate_vpe_l1_table(void)
 {
 	void __iomem *vlpi_base = gic_data_rdist_vlpi_base();
@@ -2927,7 +2951,7 @@ static int allocate_vpe_l1_table(void)
 	if (val & GICR_VPROPBASER_4_1_VALID)
 		goto out;
 
-	gic_data_rdist()->vpe_table_mask = kzalloc(sizeof(cpumask_t), GFP_ATOMIC);
+	gic_data_rdist()->vpe_table_mask = vpe_alloc_cpumask();
 	if (!gic_data_rdist()->vpe_table_mask)
 		return -ENOMEM;
 
--
2.52.0