Message-ID: <tencent_AEBB719FAF49D05B5BDF7118D729463F6405@qq.com>
Date: Wed, 4 Feb 2026 01:17:07 +0800
From: Yangyu Chen <cyy@...self.name>
To: linux-riscv@...ts.infradead.org
Cc: linux-kernel@...r.kernel.org,
Anup Patel <anup.patel@....qualcomm.com>,
Samuel Holland <samuel.holland@...ive.com>,
Charles Mirabile <cmirabil@...hat.com>,
Lucas Zampieri <lzampier@...hat.com>,
Thomas Gleixner <tglx@...nel.org>,
Paul Walmsley <pjw@...nel.org>,
Palmer Dabbelt <palmer@...belt.com>,
Mason Huo <mason.huo@...rfivetech.com>,
Zhang Xincheng <zhangxincheng@...rarisc.com>,
Charlie Jenkins <charlie@...osinc.com>,
Marc Zyngier <maz@...nel.org>,
Sia Jee Heng <jeeheng.sia@...rfivetech.com>,
Ley Foon Tan <leyfoon.tan@...rfivetech.com>,
Krzysztof Kozlowski <krzk+dt@...nel.org>,
Rob Herring <robh@...nel.org>,
Conor Dooley <conor+dt@...nel.org>,
Alexandre Ghiti <alex@...ti.fr>,
devicetree@...r.kernel.org,
Yash Shah <yash.shah@...ive.com>,
Jia Wang <wangjia@...rarisc.com>,
Yangyu Chen <cyy@...self.name>
Subject: [PATCH v3 0/2] irqchip/sifive-plic: Fix wrong nr_irqs handling
This patch series fixes long-standing bugs in the sifive-plic driver regarding
the handling of nr_irqs. Some code assumes the first irq source is 0 while
other code assumes it is 1. Since the first irq source is actually 1, this
causes various issues, including memory corruption when the number of irqs
is a multiple of 32. Also, some code treats nr_irqs as the maximum irq source
ID while other code treats it as the total number of irq sources including
the reserved source 0. This patch series standardizes nr_irqs to mean the
maximum irq source ID, with the first irq source being 1.
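To illustrate why exact multiples of 32 are the problematic case, here is a
minimal standalone sketch of the arithmetic. It is not the driver's code: the
helper names are hypothetical, and it assumes per-source state is packed one
bit per source ID into 32-bit words, starting at the reserved ID 0.

```c
#include <stdbool.h>

/* Hypothetical helpers for illustration only; not taken from the driver. */

/* Index of the 32-bit word that holds the bit for source ID max_id. */
static unsigned int last_word_index(unsigned int max_id)
{
	return max_id / 32;
}

/* Words needed when the value is the maximum source ID (IDs 0..max_id). */
static unsigned int words_for_max_id(unsigned int max_id)
{
	return (max_id + 1 + 31) / 32;
}

/* Words computed when the same value is mistaken for a count of sources. */
static unsigned int words_for_count(unsigned int count)
{
	return (count + 31) / 32;
}

/* True if sizing by count leaves the last source's word out of bounds. */
static bool count_sizing_overflows(unsigned int ndev)
{
	return last_word_index(ndev) >= words_for_count(ndev);
}
```

With ndev = 64, source 64 lands in word index 2, but sizing by count yields
only 2 words (indices 0 and 1), so the access is out of bounds. With
ndev = 63 both interpretations give 2 words and the mismatch stays hidden.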
This bug can be reproduced by modifying the PLIC node in the DT so that ndev
is an exact multiple of 32 (e.g., 32 or 64), then triggering some interrupts
and checking dmesg for memory corruption:
	plic: plic@...00000 {
		compatible = "riscv,plic0";
		reg = <0x0 0x3c000000 0x0 0x4000000>;
		#interrupt-cells = <1>;
		interrupt-controller;
		interrupts-extended = <&cpu0_intc 11>, <&cpu0_intc 9>;
		riscv,max-priority = <7>;
		riscv,ndev = <64>;
	};
Here is an example dmesg log when ndev is 64:
[ 0.077196] Unable to handle kernel paging request at virtual address ffffaf8000000000
[ 0.077205] Current swapper/0 pgtable: 4K pagesize, 48-bit VAs, pgdp=0x0000000081c2d000
[ 0.077215] [ffffaf8000000000] pgd=000000009ffffc01, p4d=000000009ffffc01, pud=000000009ffff801, pmd=000000009ffff401, pte=0000000000000000
[ 0.077240] Oops [#1]
[ 0.077246] Modules linked in:
[ 0.077254] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.19.0-rc6 #36 NONE
[ 0.077266] Hardware name: XiangShan (DT)
[ 0.077273] epc : __kmalloc_node_track_caller_noprof+0x1a0/0x524
[ 0.077284] ra : kstrdup+0x32/0x60
[ 0.077293] epc : ffffffff80253c70 ra : ffffffff801fa70e sp : ffff8f800000b700
[ 0.077304] gp : ffffffff81a1b580 tp : ffffaf8080158000 t0 : 0000000000000264
[ 0.077313] t1 : 0000000000000003 t2 : 0000000000000000 s0 : ffff8f800000b750
[ 0.077323] s1 : 0000000000000002 a0 : ffffaf8000000000 a1 : 0000000000000cc0
[ 0.077332] a2 : ffff8d800200bfc0 a3 : ffffffff81a5c5e0 a4 : ffffaf8000000000
[ 0.077342] a5 : 0000000000000003 a6 : ffffffffffffffff a7 : ffffaf8080001400
[ 0.077352] s2 : ffffaf80802ff178 s3 : ffffffff810107f0 s4 : 0000000000000000
[ 0.077362] s5 : 0000000000000000 s6 : ffff8f800000b9c0 s7 : ffffaf8080823200
[ 0.077372] s8 : ffffffff81a20580 s9 : ffffaf808012c990 s10: ffffffffffffffff
[ 0.077382] s11: 0000000000000000 t3 : 0000000000000cc0 t4 : ffffffff801fa764
[ 0.077391] t5 : 0000000000000000 t6 : 0000000000000263
[ 0.077399] status: 0000000200000120 badaddr: ffffaf8000000000 cause: 000000000000000d
[ 0.077409] [<ffffffff80253c70>] __kmalloc_node_track_caller_noprof+0x1a0/0x524
[ 0.077422] [<ffffffff801fa70e>] kstrdup+0x32/0x60
[ 0.077433] [<ffffffff801fa764>] kstrdup_const+0x28/0x34
[ 0.077444] [<ffffffff80318438>] __kernfs_new_node+0x3c/0x274
[ 0.077457] [<ffffffff80318a90>] kernfs_new_node+0x44/0x6c
[ 0.077470] [<ffffffff80318f40>] kernfs_create_dir_ns+0x20/0x7c
[ 0.077483] [<ffffffff8031b8f8>] sysfs_create_dir_ns+0x60/0xcc
[ 0.077497] [<ffffffff80b41bea>] kobject_add_internal+0xae/0x2d8
[ 0.077509] [<ffffffff80b422d6>] kobject_add+0x52/0xb8
[ 0.077520] [<ffffffff80b6401c>] __irq_alloc_descs+0x190/0x328
[ 0.077534] [<ffffffff800976de>] irq_domain_alloc_descs.part.0+0x46/0x78
[ 0.077549] [<ffffffff8009827a>] irq_create_mapping_affinity+0x72/0xcc
[ 0.077561] [<ffffffff805d27d2>] plic_probe+0x2e2/0x6c8
[ 0.077573] [<ffffffff805d2bc8>] plic_platform_probe+0x10/0x18
Changes since v2:
- Clarify the riscv,ndev meaning in the devicetree binding
  documentation for PLIC.
- Fix the entire driver code to have all nr_irqs handling consistent
  with the standard definition.
v2: https://lore.kernel.org/lkml/tencent_A697393AE256C4288768342AF245099A690A@qq.com/
Changes since v1:
- Add more Fixes tags for earlier commits that are also affected by this
  bug.
- Add more explanation about the bug's history.
v1: https://lore.kernel.org/lkml/tencent_6E9A1A3DF88005E3B4A11C4D7039637E4309@qq.com/
Yangyu Chen (2):
irqchip/sifive-plic: Fix wrong nr_irqs handling
dt-binding: riscv: Clarify the riscv,ndev meaning in PLIC
.../interrupt-controller/sifive,plic-1.0.0.yaml | 2 ++
drivers/irqchip/irq-sifive-plic.c | 16 ++++++++--------
2 files changed, 10 insertions(+), 8 deletions(-)
--
2.51.0