[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CANAwSgRvT-Mqj3XPrME6oKhYmnCUZLnwHfFHmSL=PK+xVLHAqw@mail.gmail.com>
Date: Thu, 20 Feb 2025 15:53:31 +0530
From: Anand Moon <linux.amoon@...il.com>
To: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
Cc: Kevin Xie <kevin.xie@...rfivetech.com>, Lorenzo Pieralisi <lpieralisi@...nel.org>,
Krzysztof Wilczyński <kw@...ux.com>,
Rob Herring <robh@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
Conor Dooley <conor.dooley@...rochip.com>, Minda Chen <minda.chen@...rfivetech.com>,
"open list:PCIE DRIVER FOR STARFIVE JH71x0" <linux-pci@...r.kernel.org>, open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1] PCI: starfive: Fix kmemleak in StarFive PCIe driver's
IRQ handling
Hi Manivannan,
On Thu, 20 Feb 2025 at 13:43, Anand Moon <linux.amoon@...il.com> wrote:
>
> Hi Manivannan
>
> On Fri, 14 Feb 2025 at 11:39, Manivannan Sadhasivam
> <manivannan.sadhasivam@...aro.org> wrote:
> >
> > On Tue, Feb 11, 2025 at 01:09:04AM +0530, Anand Moon wrote:
> > > Hi Manivannan
> > >
> > > On Mon, 10 Feb 2025 at 23:14, Manivannan Sadhasivam
> > > <manivannan.sadhasivam@...aro.org> wrote:
> > > >
> > > > On Sat, Feb 08, 2025 at 07:31:08PM +0530, Anand Moon wrote:
> > > > > kmemleak reported following debug log
> > > > >
> > > > > $ sudo cat /sys/kernel/debug/kmemleak
> > > > > unreferenced object 0xffffffd6c47c2600 (size 128):
> > > > > comm "kworker/u16:2", pid 38, jiffies 4294942263
> > > > > hex dump (first 32 bytes):
> > > > > cc 7c 5a 8d ff ff ff ff 40 b0 47 c8 d6 ff ff ff .|Z.....@.......
> > > > > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > > > > backtrace (crc 4f07ff07):
> > > > > __create_object+0x2a/0xfc
> > > > > kmemleak_alloc+0x38/0x98
> > > > > __kmalloc_cache_noprof+0x296/0x444
> > > > > request_threaded_irq+0x168/0x284
> > > > > devm_request_threaded_irq+0xa8/0x13c
> > > > > plda_init_interrupts+0x46e/0x858
> > > > > plda_pcie_host_init+0x356/0x468
> > > > > starfive_pcie_probe+0x2f6/0x398
> > > > > platform_probe+0x106/0x150
> > > > > really_probe+0x30e/0x746
> > > > > __driver_probe_device+0x11c/0x2c2
> > > > > driver_probe_device+0x5e/0x316
> > > > > __device_attach_driver+0x296/0x3a4
> > > > > bus_for_each_drv+0x1d0/0x260
> > > > > __device_attach+0x1fa/0x2d6
> > > > > device_initial_probe+0x14/0x28
> > > > > unreferenced object 0xffffffd6c47c2900 (size 128):
> > > > > comm "kworker/u16:2", pid 38, jiffies 4294942281
> > > > >
> > > > > This patch addresses a kmemleak reported during StarFive PCIe driver
> > > > > initialization. The leak was observed with kmemleak reporting
> > > > > unreferenced objects related to IRQ handling. The backtrace pointed
> > > > > to the `request_threaded_irq` and related functions within the
> > > > > `plda_init_interrupts` path, indicating that some allocated memory
> > > > > related to IRQs was not being properly freed.
> > > > >
> > > > > The issue was that while the driver requested IRQs, it wasn't
> > > > > correctly handling the lifecycle of the associated resources.
> > > >
> > > > What resources?
> > > >
> > > The Microchip PCIe host driver [1] handles PCI, SEC, DEBUG, and LOCAL
> > > hardware events
> > > through a dedicated framework [2]. This framework allows the core driver [3]
> > > to monitor and wait for these specific events.
> > >
> >
> > Microchip driver also has its own 'event_ops' and 'event_irq_chip', so defining
> > 'request_event_irq()' callback makes sense to me.
> >
> > > [1]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L90-L292
> > > [2]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L374-L499
> > > [3]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-plda-host.c#L448-L466
> > >
> > > StarFive PCIe driver currently lacks the necessary `request_event_irq`
> > > implementation
> > > to integrate with this event-handling mechanism, which prevents the core driver
> > > from properly responding to these events on StarFive platforms.
> > >
> > > static const struct plda_event mc_event = {
> > > . request_event_irq = mc_request_event_irq,
> > > .intx_event = EVENT_LOCAL_PM_MSI_INT_INTX,
> > > .msi_event = EVENT_LOCAL_PM_MSI_INT_MSI,
> > > };
> > >
> > > This patch implements dummy `request_event_irq` for the StarFive PCIe driver,
> > > Since the core [3] driver is monitoring for these events
> > >
> >
> > This still doesn't make sense to me. Under what condition you observed the
> > kmemleak? Since it points to devm_request_irq(), I can understand that the
> > memory allocated for the IRQ is not freed. But when does it happen?
> >
> The PCIe host driver is responsible for monitoring the PLDA PCIe EVENT irq.
> However, I have been unable to locate the register map necessary to
> implement the
> required code updates,
>
> alarm@...hriscv:/media/nvme0/mainline/linux-riscv-6.y-devel$ cat
> /proc/interrupts
> CPU0 CPU1 CPU2 CPU3
> 10: 101145 76126 79040 80218 RISC-V INTC 5 Edge
> riscv-timer
> 12: 3 0 0 0 SiFive PLIC 111 Edge
> 17030000.power-controller
> 13: 20 0 0 0 SiFive PLIC 30 Edge
> 1600c000.rng
> 14: 0 0 0 0 SiFive PLIC 1 Edge
> ccache_ecc
> 15: 0 0 0 0 SiFive PLIC 3 Edge
> ccache_ecc
> 16: 0 0 0 0 SiFive PLIC 4 Edge
> ccache_ecc
> 17: 0 0 0 0 SiFive PLIC 2 Edge
> ccache_ecc
> 20: 0 0 0 0 SiFive PLIC 73 Edge
> dw_axi_dmac_platform
> 21: 1694 0 0 0 SiFive PLIC 32 Edge ttyS0
> 22: 0 0 0 0 SiFive PLIC 35 Edge
> 10030000.i2c
> 23: 0 0 0 0 SiFive PLIC 75 Edge
> dw-mci
> 24: 0 0 0 0 SiFive PLIC 37 Edge
> 10050000.i2c
> 25: 171 0 0 0 SiFive PLIC 50 Edge
> 12050000.i2c
> 26: 0 0 0 0 SiFive PLIC 51 Edge
> 12060000.i2c
> 27: 10419 0 0 0 SiFive PLIC 74 Edge
> dw-mci
> 28: 6 0 0 0 SiFive PLIC 25 Edge
> 13010000.spi
> 29: 0 0 0 0 SiFive PLIC 38 Edge pl022
> 31: 0 0 0 0 PLDA PCIe EVENT 16
> Edge 940000000.pcie
> 32: 0 0 0 0 PLDA PCIe EVENT 17
> Edge 940000000.pcie
> 33: 0 0 0 0 PLDA PCIe EVENT 18
> Edge 940000000.pcie
> 34: 0 0 0 0 PLDA PCIe EVENT 20
> Edge 940000000.pcie
> 35: 0 0 0 0 PLDA PCIe EVENT 21
> Edge 940000000.pcie
> 36: 0 0 0 0 PLDA PCIe EVENT 22
> Edge 940000000.pcie
> 39: 0 0 0 0 PLDA PCIe EVENT 26
> Edge 940000000.pcie
> 40: 0 0 0 0 PLDA PCIe EVENT 27
> Edge 940000000.pcie
> 41: 0 0 0 0 17020000.pinctrl 41
> Edge 16020000.mmc cd
> 42: 0 0 0 0 PLDA PCIe EVENT 28
> Edge 940000000.pcie
> 44: 0 0 0 0 PLDA PCIe MSI 0
> Edge PCIe PME, aerdrv, PCIe bwctrl
> 46: 0 0 0 0 PLDA PCIe EVENT 16
> Edge 9c0000000.pcie
> 47: 0 0 0 0 PLDA PCIe EVENT 17
> Edge 9c0000000.pcie
> 48: 0 0 0 0 PLDA PCIe EVENT 18
> Edge 9c0000000.pcie
> 49: 0 0 0 0 PLDA PCIe EVENT 20
> Edge 9c0000000.pcie
> 50: 0 0 0 0 PLDA PCIe EVENT 21
> Edge 9c0000000.pcie
> 51: 0 0 0 0 PLDA PCIe EVENT 22
> Edge 9c0000000.pcie
> 54: 0 0 0 0 PLDA PCIe EVENT 26
> Edge 9c0000000.pcie
> 55: 0 0 0 0 PLDA PCIe EVENT 27
> Edge 9c0000000.pcie
> 56: 0 0 0 0 PLDA PCIe EVENT 28
> Edge 9c0000000.pcie
> 58: 0 0 0 0 PLDA PCIe MSI
> 134217728 Edge PCIe PME, aerdrv, PCIe bwctrl
>
> Yep, Thanks for your review comment.
>
> The following changes fix the kernel memory leak warning at my end.
>
> $ git diff
> diff --git a/drivers/pci/controller/plda/pcie-plda-host.c
> b/drivers/pci/controller/plda/pcie-plda-host.c
> index 4153214ca410..d4e7c1b49607 100644
> --- a/drivers/pci/controller/plda/pcie-plda-host.c
> +++ b/drivers/pci/controller/plda/pcie-plda-host.c
> @@ -461,6 +461,7 @@ int plda_init_interrupts(struct platform_device *pdev,
>
> if (ret) {
> dev_err(dev, "failed to request IRQ %d\n", event_irq);
> + irq_dispose_mapping(event_irq);
> return ret;
> }
> }
Following the change fix this warning in a kernel memory leak.
Would you happen to have any comments on these changes?
diff --git a/drivers/pci/controller/plda/pcie-plda-host.c
b/drivers/pci/controller/plda/pcie-plda-host.c
index 4153214ca410..5a72a5a33074 100644
--- a/drivers/pci/controller/plda/pcie-plda-host.c
+++ b/drivers/pci/controller/plda/pcie-plda-host.c
@@ -280,11 +280,6 @@ static u32 plda_get_events(struct plda_pcie_rp *port)
return events;
}
-static irqreturn_t plda_event_handler(int irq, void *dev_id)
-{
- return IRQ_HANDLED;
-}
-
static void plda_handle_event(struct irq_desc *desc)
{
struct plda_pcie_rp *port = irq_desc_get_handler_data(desc);
@@ -454,13 +449,10 @@ int plda_init_interrupts(struct platform_device *pdev,
if (event->request_event_irq)
ret = event->request_event_irq(port, event_irq, i);
- else
- ret = devm_request_irq(dev, event_irq,
- plda_event_handler,
- 0, NULL, port);
if (ret) {
dev_err(dev, "failed to request IRQ %d\n", event_irq);
+ irq_dispose_mapping(event_irq);
return ret;
}
}
> > > > > This patch introduces an event IRQ handler and the necessary
> > > > > infrastructure to manage these IRQs, preventing the memory leak.
> > > > >
> > > >
> > > > These handles appear pointless to me. What purpose are they serving?
> > > >
> > > Yes, you are correct, the core event monitoring framework [3] triggered a
> > > kernel memory leak. This patch adds a dummy IRQ callback as a
> > > placeholder for proper event handling, which can be implemented in a
> > > future patch.
> > >
> >
> > The dummy request_event_irq() callback is not supposed to be needed in the first
> > place. So clearly, this patch is not fixing the actual memory leak but trying to
> > cover it up.
> You are correct.
> >
> > - Mani
> >
> Thanks
> -Anand
Thanks
-Anand
> > --
> > மணிவண்ணன் சதாசிவம்
Powered by blists - more mailing lists