lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CANAwSgQFET-vWoOSSMFWs3LZeQMaP+VgU6o2_1Rro6NmGXQbVQ@mail.gmail.com>
Date: Mon, 24 Feb 2025 21:37:14 +0530
From: Anand Moon <linux.amoon@...il.com>
To: Manivannan Sadhasivam <mani@...nel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>, 
	Kevin Xie <kevin.xie@...rfivetech.com>, Lorenzo Pieralisi <lpieralisi@...nel.org>, 
	Krzysztof Wilczyński <kw@...ux.com>, 
	Rob Herring <robh@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>, 
	Conor Dooley <conor.dooley@...rochip.com>, Minda Chen <minda.chen@...rfivetech.com>, 
	"open list:PCIE DRIVER FOR STARFIVE JH71x0" <linux-pci@...r.kernel.org>, open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1] PCI: starfive: Fix kmemleak in StarFive PCIe driver's
 IRQ handling

Hi Manivannan,

On Mon, 24 Feb 2025 at 20:12, Manivannan Sadhasivam <mani@...nel.org> wrote:
>
> On Mon, Feb 24, 2025 at 07:33:37PM +0530, Anand Moon wrote:
> > Hi Manivannan
> >
> > On Mon, 24 Feb 2025 at 17:24, Manivannan Sadhasivam
> > <manivannan.sadhasivam@...aro.org> wrote:
> > >
> > > On Mon, Feb 24, 2025 at 03:38:29PM +0530, Anand Moon wrote:
> > > > Hi Manivannan
> > > >
> > > > On Mon, 24 Feb 2025 at 13:31, Manivannan Sadhasivam
> > > > <manivannan.sadhasivam@...aro.org> wrote:
> > > > >
> > > > > On Thu, Feb 20, 2025 at 03:53:31PM +0530, Anand Moon wrote:
> > > > >
> > > > > [...]
> > > > >
> > > > > > Following the change fix this warning in a kernel memory leak.
> > > > > > Would you happen to have any comments on these changes?
> > > > > >
> > > > > > diff --git a/drivers/pci/controller/plda/pcie-plda-host.c
> > > > > > b/drivers/pci/controller/plda/pcie-plda-host.c
> > > > > > index 4153214ca410..5a72a5a33074 100644
> > > > > > --- a/drivers/pci/controller/plda/pcie-plda-host.c
> > > > > > +++ b/drivers/pci/controller/plda/pcie-plda-host.c
> > > > > > @@ -280,11 +280,6 @@ static u32 plda_get_events(struct plda_pcie_rp *port)
> > > > > >         return events;
> > > > > >  }
> > > > > >
> > > > > > -static irqreturn_t plda_event_handler(int irq, void *dev_id)
> > > > > > -{
> > > > > > -       return IRQ_HANDLED;
> > > > > > -}
> > > > > > -
> > > > > >  static void plda_handle_event(struct irq_desc *desc)
> > > > > >  {
> > > > > >         struct plda_pcie_rp *port = irq_desc_get_handler_data(desc);
> > > > > > @@ -454,13 +449,10 @@ int plda_init_interrupts(struct platform_device *pdev,
> > > > > >
> > > > > >                 if (event->request_event_irq)
> > > > > >                         ret = event->request_event_irq(port, event_irq, i);
> > > > > > -               else
> > > > > > -                       ret = devm_request_irq(dev, event_irq,
> > > > > > -                                              plda_event_handler,
> > > > > > -                                              0, NULL, port);
> > > > >
> > > > > This change is not related to the memleak. But I'd like to have it in a separate
> > > > > patch since this code is absolutely not required, rather pointless.
> > > > >
> > > > Yes, remove these changes to fix the memory leak issue I observed.
> > > >
> > >
> > > Sorry, I don't get you. This specific code change of removing 'devm_request_irq'
> > > is not supposed to fix memleak.
> > >
> > > If you are seeing the memleak getting fixed because of it, then something is
> > > wrong with the irq implementation. You need to figure it out.
> >
> > Declaring request_event_irq in the host controller facilitates the
> > creation of a dedicated IRQ event handler.
> > In its absence, a dummy devm_request_irq was employed, but this
> > resulted in unhandled IRQs and subsequent memory leaks.
>
> What do you mean by 'unhandled IRQs'? There is a dummy IRQ handler invoked to
> handle these IRQs. Even your starfive_event_handler() that you proposed was
> doing the same thing.
>
Yes, but my solution was to work around

> > Eliminating the dummy code eliminated the memory leak logs.
>
>From the code, we are creating a mapping of the IRQ event

     for_each_set_bit(i, &port->events_bitmap, port->num_events) {
                event_irq = irq_create_mapping(port->event_domain, i);
                if (!event_irq) {
                        dev_err(dev, "failed to map hwirq %d\n", i);
                        return -ENXIO;
                }

                if (event->request_event_irq)
                        ret = event->request_event_irq(port,
event_irq, i);   <---
                else
                        ret = devm_request_irq(dev, event_irq,
                                               plda_event_handler,
                                               0, NULL, port);

                if (ret) {
                        dev_err(dev, "failed to request IRQ %d\n", event_irq);
                        return ret;
                }
        }

in the microchip PCIe host we are requesting those IRQ events mapping.

static int mc_request_event_irq(struct plda_pcie_rp *plda, int event_irq,
                                int event)
{
        return devm_request_irq(plda->dev, event_irq, mc_event_handler,
                                0, event_cause[event].sym, plda);
}

static const struct plda_event_ops mc_event_ops = {
        .get_events = mc_get_events,
};

static const struct plda_event mc_event = {
        .request_event_irq = mc_request_event_irq,
        .intx_event        = EVENT_LOCAL_PM_MSI_INT_INTX,
        .msi_event         = EVENT_LOCAL_PM_MSI_INT_MSI,
};

> Sorry, this is not a valid justification. But as I said before, the change
> itself (removing the dummy irq handler and related code) looks good to me as I
> see no need for that. But I cannot accept it as a fix for the memleak.

The StarFive PCIe host lacks the necessary hardware event mapping.
Consequently, the system attempts to handle dummy events, resulting
in observed log messages.

The issue is likely due to devm_request_irq being called with a NULL devname,
preventing proper IRQ mapping.

I have tested on the StarFive JH7110 VisionFive2 RISC-V board.

$ sudo cat /sys/kernel/debug/kmemleak
unreferenced object 0xffffffd6c47c2600 (size 128):
  comm "kworker/u16:2", pid 38, jiffies 4294942263
  hex dump (first 32 bytes):
    cc 7c 5a 8d ff ff ff ff 40 b0 47 c8 d6 ff ff ff  .|Z.....@.......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 4f07ff07):
    __create_object+0x2a/0xfc
    kmemleak_alloc+0x38/0x98
    __kmalloc_cache_noprof+0x296/0x444
    request_threaded_irq+0x168/0x284
    devm_request_threaded_irq+0xa8/0x13c
    plda_init_interrupts+0x46e/0x858
    plda_pcie_host_init+0x356/0x468
    starfive_pcie_probe+0x2f6/0x398
    platform_probe+0x106/0x150
    really_probe+0x30e/0x746
    __driver_probe_device+0x11c/0x2c2
    driver_probe_device+0x5e/0x316
    __device_attach_driver+0x296/0x3a4
    bus_for_each_drv+0x1d0/0x260
    __device_attach+0x1fa/0x2d6
    device_initial_probe+0x14/0x28
unreferenced object 0xffffffd6c47c2900 (size 128):
  comm "kworker/u16:2", pid 38, jiffies 4294942281

>
> - Mani
>
> --
> மணிவண்ணன் சதாசிவம்

Thanks
-Anand

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ