[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87bjifavyu.ffs@tglx>
Date: Tue, 27 Jan 2026 16:58:33 +0100
From: Thomas Gleixner <tglx@...nel.org>
To: Yangyu Chen <cyy@...self.name>, linux-riscv@...ts.infradead.org
Cc: linux-kernel@...r.kernel.org, Anup Patel <anup.patel@....qualcomm.com>,
Samuel Holland <samuel.holland@...ive.com>, Charles Mirabile
<cmirabil@...hat.com>, Lucas Zampieri <lzampier@...hat.com>, Paul Walmsley
<pjw@...nel.org>, Palmer Dabbelt <palmer@...belt.com>, Mason Huo
<mason.huo@...rfivetech.com>, Zhang Xincheng
<zhangxincheng@...rarisc.com>, Charlie Jenkins <charlie@...osinc.com>,
Marc Zyngier <maz@...nel.org>, Sia Jee Heng
<jeeheng.sia@...rfivetech.com>, Ley Foon Tan
<leyfoon.tan@...rfivetech.com>, Yangyu Chen <cyy@...self.name>
Subject: Re: [PATCH v2] irqchip/sifive-plic: Fix insufficient irq_groups
allocation
On Wed, Jan 21 2026 at 13:45, Yangyu Chen wrote:
> Since the first irq source is 1 instead of 0, when the number of
> irqs is multiple of 32, the last irq group will be ignored during
> allocation, saving, and restoring. This lead to memory corruption
> when accessing enable_save beyond allocated memory after commit
> 14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state")
> which will access enable_save for all sources during plic_probe.
> Thus, we should allocate irq_groups based on (nr_irqs + 1) instead of
> nr_irqs to avoid this issue.
>
> This is an long standing bug since Linux v6.4 but since the last irq
> source is rarely used, it may not be triggered in practice until commit
> 14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state").
I'm absolutely not convinced that this is the right fix. The handling of
nr_irqs in this driver is completely inconsistent:
1)
static int plic_irq_suspend(void *data)
/* irq ID 0 is reserved */
for (unsigned int i = 1; i < priv->nr_irqs; i++) {
2)
static void plic_irq_resume(void *data)
/* irq ID 0 is reserved */
for (unsigned int i = 1; i < priv->nr_irqs; i++) {
...
for (i = 0; i < DIV_ROUND_UP(priv->nr_irqs, 32); i++) {
3)
static irq_hw_number_t cp100_get_hwirq(struct plic_handler *handler, void __iomem *claim)
int nr_irq_groups = DIV_ROUND_UP(handler->priv->nr_irqs, 32);
for (i = 0; i < nr_irq_groups; i++) {
4)
static int plic_probe(struct fwnode_handle *fwnode)
priv->prio_save = bitmap_zalloc(nr_irqs, GFP_KERNEL);
...
for (int j = 0; j <= nr_irqs / 32; j++)
writel(0, enable_base + j);
...
handler->enable_save = kcalloc(DIV_ROUND_UP(nr_irqs, 32),...
...
for (hwirq = 1; hwirq <= nr_irqs; hwirq++) {
...
priv->irqdomain = irq_domain_create_linear(fwnode, nr_irqs + 1,
So can the SIFIVE people please clarify once and forever what's the
actual meaning of nr_irqs:
A) The actual number of hardware interrupts available for drivers
starting from hwirq=1
B) The total number of hardware interrupts, where hwirq=0 is reserved
and therefore the number of interrupts available to drivers is
nr_irqs - 1.
That needs to be clarified first and then subsequently every single
instance which deals with nr_irqs has to be updated accordingly.
A random fix which cures the symptom of an out of bounds memory access
is not helpful at all to cure the underlying root cause of
inconsistency.
Thanks,
tglx
Powered by blists - more mailing lists