lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 25 Jul 2022 15:43:40 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     Johan Hovold <johan@...nel.org>
Cc:     Bjorn Helgaas <helgaas@...nel.org>,
        Pali Rohár <pali@...nel.org>,
        Johan Hovold <johan+linaro@...nel.org>,
        Kishon Vijay Abraham I <kishon@...com>,
        Xiaowei Song <songxiaowei@...ilicon.com>,
        Binghui Wang <wangbinghui@...ilicon.com>,
        Thierry Reding <thierry.reding@...il.com>,
        Ryder Lee <ryder.lee@...iatek.com>,
        Jianjun Wang <jianjun.wang@...iatek.com>,
        linux-pci@...r.kernel.org,
        Krzysztof Wilczyński <kw@...ux.com>,
        Ley Foon Tan <ley.foon.tan@...el.com>,
        linux-kernel@...r.kernel.org
Subject: Re: Why set .suppress_bind_attrs even though .remove() implemented?

On Mon, 25 Jul 2022 14:25:49 +0100,
Johan Hovold <johan@...nel.org> wrote:
> 
> [ +CC: maz ]
> 
> On Fri, Jul 22, 2022 at 09:38:58AM -0500, Bjorn Helgaas wrote:
> > On Fri, Jul 22, 2022 at 03:26:44PM +0200, Johan Hovold wrote:
> > > On Thu, Jul 21, 2022 at 05:21:22PM -0500, Bjorn Helgaas wrote:
> > 
> > > > qcom is a DWC driver, so all the IRQ stuff happens in
> > > > dw_pcie_host_init().  qcom_pcie_remove() does call
> > > > dw_pcie_host_deinit(), which calls irq_domain_remove(), but nobody
> > > > calls irq_dispose_mapping().
> > > > 
> > > > I'm thoroughly confused by all this.  But I suspect that maybe I
> > > > should drop the "make qcom modular" patch because it seems susceptible
> > > > to this problem:
> > > > 
> > > >   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=pci/ctrl/qcom&id=41b68c2d097e
> > > 
> > > That should not be necessary.
> > > 
> > > As you note above, interrupt handling is implemented in dwc core so if
> > > there are any issue here at all, which I doubt, then all of the dwc
> > > drivers that currently can be built as modules would all be broken and
> > > this would need to be fixed in core.
> > 
> > I don't know yet whether there's an issue.  We need a clear argument
> > for why there is or is not.  The fact that others might be broken is
> > not an argument for breaking another one ;)
> 
> It's not breaking anything that is currently working, and if there's
> some corner case during module unload, that's not the end of the world
> either.

It may not be the end of the world for you, but you have absolutely no
idea of what dangling pointers to kernel memory will do on a user
machine, nor how this can be further exploited. Unloading a module
should never result in an unsafe kernel.

> It's a feature useful for developers and no one expects remove code to
> be perfect (e.g. resilient against someone trying to break it by doing
> things in parallel, etc.).

If that's a feature for you while you are developing, then please keep
this change as part of your own hacking toolbox. IMO the upstream
kernel shouldn't be subjected to this.

> 
> > > I've been using the modular pcie-qcom patch for months now, unloading
> > > and reloading the driver repeatedly to test power sequencing, without
> > > noticing any problems whatsoever.
> > 
> > Pali's commit log suggests that unloading the module is not, by
> > itself, enough to trigger the problem:
> > 
> >   https://lore.kernel.org/linux-pci/20220709161858.15031-1-pali@kernel.org/
> > 
> > Can you test the scenario he mentions?
> 
> Turns out the pcie-qcom driver does not support legacy interrupts so
> there's no risk of there being any lingering mappings if I understand
> things correctly.

It still does MSIs, thanks to dw_pcie_host_init(). If you can remove
the driver while devices are up and running with MSIs allocated,
things may get ugly if things align the wrong way (if a driver still
has a reference to an irq_desc or irq_data, for example).

	M.

-- 
Without deviation from the norm, progress is not possible.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ