lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220721224825.7aqcvcnzaht57mii@pali>
Date:   Fri, 22 Jul 2022 00:48:25 +0200
From:   Pali Rohár <pali@...nel.org>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     Johan Hovold <johan+linaro@...nel.org>,
        Kishon Vijay Abraham I <kishon@...com>,
        Xiaowei Song <songxiaowei@...ilicon.com>,
        Binghui Wang <wangbinghui@...ilicon.com>,
        Thierry Reding <thierry.reding@...il.com>,
        Ryder Lee <ryder.lee@...iatek.com>,
        Jianjun Wang <jianjun.wang@...iatek.com>,
        linux-pci@...r.kernel.org,
        Krzysztof Wilczyński <kw@...ux.com>,
        Ley Foon Tan <ley.foon.tan@...el.com>,
        linux-kernel@...r.kernel.org
Subject: Re: Why set .suppress_bind_attrs even though .remove() implemented?

On Thursday 21 July 2022 17:21:22 Bjorn Helgaas wrote:
> [+to Johan for qcom]
> [-cc Tom, email bounces]
> 
> On Thu, Jul 21, 2022 at 10:46:07PM +0200, Pali Rohár wrote:
> > On Thursday 21 July 2022 14:54:33 Bjorn Helgaas wrote:
> > > The j721e, kirin, tegra, and mediatek drivers all implement .remove().
> > > 
> > > They also set ".suppress_bind_attrs = true".  I think this means
> > > bus_add_driver() will not create the "bind" and "unbind" sysfs
> > > attributes for the driver that would allow users to users to manually
> > > attach and detach devices from it.
> > > 
> > > Is there a reason for this, or should these drivers stop setting
> > > .suppress_bind_attrs?
> > 
> > I have already asked this question during review of kirin driver:
> > https://lore.kernel.org/linux-pci/20211031205527.ochhi72dfu4uidii@pali/
> > 
> > Microchip driver wanted to change its type from bool to tristate
> > https://lore.kernel.org/linux-pci/20220420093449.38054-1-u.kleine-koenig@pengutronix.de/t/#u
> > and after discussion it seems that it is needed to do more work for this
> > driver.
> > 
> > > For example, Pali and Ley Foon *did* stop setting .suppress_bind_attrs
> > > when adding .remove() methods in these commits:
> > > 
> > >   0746ae1be121 ("PCI: mvebu: Add support for compiling driver as module")
> > >   526a76991b7b ("PCI: aardvark: Implement driver 'remove' function and allow to build it as module")
> > >   ec15c4d0d5d2 ("PCI: altera: Allow building as module")
> > 
> > I added it for both pci-mvebu.c and pci-aardvark.c. And just few days
> > ago I realized why suppress_bind_attrs was set to true and remove method
> > was not implemented.
> 
> With suppress_bind_attrs, the user can't manually unbind a device, so
> we can't get to mvebu_pcie_remove() that way, but since mvebu is a
> modular driver, I assume we can unload the module and *that* would
> call mvebu_pcie_remove().  Right?

Yes! It looks like that if module unloading is allowed then .remove
callback would be called despite suppress_bind_attrs was set.

> > Implementing remove method is not really simple, specially when pci
> > controller driver implements also interrupt controller (e.g. for
> > handling legacy interrupts).
> 
> Hmmm.  Based on your patches below, it looks like we need to call
> irq_dispose_mapping() in some cases, but I'm very confused about
> *which* cases.
> 
> I first thought it was for mappings created with irq_create_mapping(),
> but pci-aardvark.c never calls that, so there must be more to it.

>From reading kernel code it looks like that mapping is created when
some other driver request shared INTx IRQ. So it is not done by
pci-aardvark.c but e.g. by wifi driver ath9k (which uses INTx).

> Currently only altera, iproc, mediatek-gen3, and mediatek call
> irq_dispose_mapping() from their .remove() methods.  (They all call
> irq_domain_remove() *before* irq_dispose_mapping().  Is that legal?
> Your patches do irq_dispose_mapping() *first*.)

Documentation says:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/irq/irqdomain.c?h=v5.18#n240

/**
 * irq_domain_remove() - Remove an irq domain.
 * @domain: domain to remove
 *
 * This routine is used to remove an irq domain. The caller must ensure
 * that all mappings within the domain have been disposed of prior to
 * use, depending on the revmap type.
 */

> altera, mediatek-gen3, and mediatek call irq_dispose_mapping() on IRQs
> that came from platform_get_irq().

I think that if IRQ is not used it could be disposed (no need to have it
mapped). So if this IRQ from platform_get_irq() is not shared then in
pci controller driver .remove callback for sure it is not used anymore.
So for non-shared IRQs this looks fine.

> qcom is a DWC driver, so all the IRQ stuff happens in
> dw_pcie_host_init().  qcom_pcie_remove() does call
> dw_pcie_host_deinit(), which calls irq_domain_remove(), but nobody
> calls irq_dispose_mapping().
> 
> I'm thoroughly confused by all this.

Beware that there are IRQs which pci controller uses as "client" and
then there may be also IRQs which pci controller driver allocates via
own IRQ chip driver and provides for other drivers (e.g. as chained irq
handler). So it is needed to distinguish between them. IRQs which
controller driver create via own IRQ chip driver have to be properly
cleaned when calling .remove driver. IRQ which controller driver get via
platform_get_irq() (or similar of method) is under parent driver control
and parent driver is responsible for proper dispose/cleanup.

> But I suspect that maybe I
> should drop the "make qcom modular" patch because it seems susceptible
> to this problem:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=pci/ctrl/qcom&id=41b68c2d097e

In pcie-qcom.c there is no irq reference. So maybe it is safe? But I do
not understand designware driver API design...

The best is probably to do runtime test. Boot kernel with pci=nomsi and
connect some PCIe card which supports legacy INTx interrupt. This should
force usage of INTx. Check that card driver is loaded and in file
/proc/interrupts is IRQ line for that driver. Then rmmod pci controller
driver and check that IRQ line in /proc/interrupts disappeared. And also
check that /sys/kernel/debug/irq/irqs/<num> does not exist.

Also another runtime test is to call rmmod and modprobe of pci
controller driver more times (with PCIe card connected) and check that
IRQ numbers assigned to card in /proc/interrupts are same. If after
rmmod + modprobe sequence are numbers increasing then there is some
IRQ-leak in .remove callback.

> > Here are waiting fixup patches for pci-mvebu.c and pci-aardvark.c which
> > fixes .remove callback. Without these patches calling 'rmmod driver' let
> > dangling pointer in kernel which may cause random kernel crashes. See:
> > 
> > https://lore.kernel.org/linux-pci/20220709161858.15031-1-pali@kernel.org/
> > https://lore.kernel.org/linux-pci/20220711120626.11492-1-pali@kernel.org/
> > https://lore.kernel.org/linux-pci/20220711120626.11492-2-pali@kernel.org/
> > 
> > So I would suggest to do more detailed review when adding .remove
> > callback for pci controller driver (or when remove suppress_bind_attrs)
> > and do more testings and checking if all IRQ mappings are disposed.
> 
> I'm not smart enough to do "more detailed review" because I don't know
> what things to look for :)  Thanks for all your work in sorting out
> these arcane details!
> 
> Bjorn

Anyway, this issue was discovered by Hajo Noerenberg as a "Side note:"
during debugging SATA controller:
https://bugzilla.kernel.org/show_bug.cgi?id=216094#c31

It looks like that on some machines rmmod cause immediate kernel crash
and on other machines it is hard to reproduce.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ