[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <280dd506-e1fc-4d2e-bdc4-98dd9dca6138@grenoble.cnrs.fr>
Date: Thu, 14 Aug 2025 14:59:42 +0200
From: Edgar Bonet <bonet@...noble.cnrs.fr>
To: Thomas Gleixner <tglx@...utronix.de>,
Nicolas Ferre <nicolas.ferre@...rochip.com>,
Alexandre Belloni <alexandre.belloni@...tlin.com>,
Claudiu Beznea <claudiu.beznea@...on.dev>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
Geert Uytterhoeven <geert+renesas@...der.be>
Subject: [ISSUE + PATCH] Interrupts were enabled early by spinlock guard
Hello everybody!
I am facing an "Interrupts were enabled early" kernel warning which, as
far as I can tell, is caused by a spinlock guard in an ARM/Microchip
IRQCHIP driver. I think I solved the issue, and I am proposing a patch
below, after the scissor. But I must first disclose that:
* I am completely new to Linux internals and its development process.
This is why I chose to err on the side of providing too much
information on my issue. It is not unlikely that I am doing
something very wrong.
* I am not subscribed to either of the linux-{,arm-}kernel mailing
lists
* I will be far from the Internet in the few days to come. I should be
connected an responsive starting from 2025-08-24.
## Context
I am playing with an Acmesystems Acqua[1] system on module, which is
based on a SAMA5D31 SoC (single core Cortex-A5). I maintain the
Buildroot defconfig for this board,[2] which is currently based on a
vanilla Linux kernel v6.12.41.
As I wanted to check that the board runs fine on newer kernels, I built
a v6.16 for it using gcc 14, binutils 2.43, the in-tree sama5_defconfig
merged with this fragment:
# CONFIG_BRIDGE is not set
# CONFIG_MODULES is not set
# CONFIG_NET_DSA is not set
# CONFIG_WIRELESS is not set
# CONFIG_USB_NET_DRIVERS is not set
# CONFIG_WLAN is not set
# CONFIG_MEDIA_SUPPORT is not set
# CONFIG_SOUND is not set
# CONFIG_AUTOFS_FS is not set
and the Buildroot-provided device tree, patched for compatibility with
Linux commit 510a6190cf5e ("ARM: dts: microchip: fix faulty ohci/ehci
node names").
The defconfig fragment above is meant to remove module support (with
some unused drivers along the way), which makes testing easier for me.
## Issue
While booting, Linux v6.16 printed this message on the serial console:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at init/main.c:1024 start_kernel+0x4d0/0x5dc
Interrupts were enabled early
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.16.0 #1 NONE
Hardware name: Atmel SAMA5
Call trace:
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x38/0x48
dump_stack_lvl from __warn+0x78/0xd4
__warn from warn_slowpath_fmt+0x98/0xc8
warn_slowpath_fmt from start_kernel+0x4d0/0x5dc
start_kernel from 0x0
---[ end trace 0000000000000000 ]---
The board seemed to work fine, so maybe this warning is completely
harmless. It looked, however, scary enough to deserve some
investigation.
## Bug hunting
I could not reproduce the issue on a qemu virtual machine. All tests
were then done on the real hardware.
I looked for the Linux commit that introduced the issue. ‘git bisect’
told me this was 195298c3b116 ("genirq/generic-chip: Convert core code
to lock guards"). I then checked out a recent mainline kernel, namely
v6.17-rc1, and tried to revert that commit. For this, I first had to
revert two follow-up commits, namely 7ae844a6650c ("genirq/generic-chip:
Remove unused lock wrappers") and 771487050f83 ("genirq/generic-chip:
Fix incorrect lock guard conversions"). This was unsuccessful: I still
had the warning.
I went back to a clean v6.17-rc1 and tried to find the thing that was
enabling interrupts early. After lots of printk() debugging, I
discovered it was a guard(raw_spinlock_irq) destructor. The call chain
is this (line numbers are for v6.17-rc1):
start_kernel (init/main.c:1011)
-> time_init (arch/arm/kernel/time.c:96)
-> timer_probe (drivers/clocksource/timer-probe.c:30)
=> tcb_clksrc_init (drivers/clocksource/timer-atmel-tcb.c:413)
-> of_irq_get (drivers/of/irq.c:474)
-> irq_create_of_mapping (kernel/irq/irqdomain.c:980)
-> irq_create_fwspec_mapping (kernel/irq/irqdomain.c:848)
=> aic5_irq_domain_xlate (drivers/irqchip/irq-atmel-aic5.c:287)
-> guard destructor (raw_spin_unlock_irq?)
where (=>) is an indirect call through a function pointer.
## Tentative fix
Commit 771487050f83 gave me the inspiration. The guard in question was
introduced by b00bee8afaca ("irqchip: Convert generic irqchip locking to
guards"), which replaced calls to irq_gc_lock_irq{save,restore}() by
guard(raw_spinlock_irq) (with no “save” in the name). The commit log
states that this is “intended and correct”, but I could not make sense
of the explanation. My (possibly faulty) understanding is that the guard
constructor disables interrupts, and the destructor either
unconditionally enables them (raw_spinlock_irq), or restores the
previous interrupt state (raw_spinlock_irqsave).
I then replaced guard(raw_spinlock_irq) with guard(raw_spinlock_irqsave)
and that seemed to do the job: the warning is gone. See the patch below
the scissors.
Best regards, and thank-you for reading so far.
Edgar Bonet.
[1] https://www.acmesystems.it/acqua
[2] https://gitlab.com/buildroot.org/buildroot/-/blob/2025.08-rc1/configs/acmesystems_acqua_a5_512mb_defconfig
------------------------------------------------------------------ >8 --
Subject: [PATCH] irqchip/atmel-aic5: Fix incorrect lock guard conversion
Commit b00bee8afaca ("irqchip: Convert generic irqchip locking to guards")
replaced calls to irq_gc_lock_irq{save,restore}() with
guard(raw_spinlock_irq). However, in irq-atmel-aic5.c, one such guard is
created early in the boot process, before interrupts are initially enabled.
As its destructor enables interrupts, this results in the following warning
on a SAMA5D31-based system:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at init/main.c:1024 start_kernel+0x4d0/0x5dc
Interrupts were enabled early
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.16.0 #1 NONE
Hardware name: Atmel SAMA5
Call trace:
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x38/0x48
dump_stack_lvl from __warn+0x78/0xd4
__warn from warn_slowpath_fmt+0x98/0xc8
warn_slowpath_fmt from start_kernel+0x4d0/0x5dc
start_kernel from 0x0
---[ end trace 0000000000000000 ]---
Fix this by using guard(raw_spinlock_irqsave) instead.
Fixes: b00bee8afaca ("irqchip: Convert generic irqchip locking to guards")
Signed-off-by: Edgar Bonet <bonet@...noble.cnrs.fr>
---
drivers/irqchip/irq-atmel-aic5.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-atmel-aic5.c b/drivers/irqchip/irq-atmel-aic5.c
index 60b00d2c3d7a..1f14b401f71d 100644
--- a/drivers/irqchip/irq-atmel-aic5.c
+++ b/drivers/irqchip/irq-atmel-aic5.c
@@ -279,7 +279,7 @@ static int aic5_irq_domain_xlate(struct irq_domain *d,
if (ret)
return ret;
- guard(raw_spinlock_irq)(&bgc->lock);
+ guard(raw_spinlock_irqsave)(&bgc->lock);
irq_reg_writel(bgc, *out_hwirq, AT91_AIC5_SSR);
smr = irq_reg_readl(bgc, AT91_AIC5_SMR);
aic_common_set_priority(intspec[2], &smr);
--
2.43.0
Powered by blists - more mailing lists