lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 23 Mar 2021 12:04:34 -0700
From:   Dmitry Torokhov <dmitry.torokhov@...il.com>
To:     Mark Brown <broonie@...nel.org>
Cc:     linux-spi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] spi: ensure timely release of driver-allocated resources

On Tue, Mar 23, 2021 at 05:36:06PM +0000, Mark Brown wrote:
> On Mon, Mar 22, 2021 at 12:38:15PM -0700, Dmitry Torokhov wrote:
> > On Mon, Mar 22, 2021 at 12:37:07PM +0000, Mark Brown wrote:
> 
> > > This feels like it might make sense to push up to the driver core level
> > > then rather than doing in individual buses?
> 
> > That is exactly the issue: we can't. Driver core already releases all
> > resources when a device is being unbound but that happens after bus
> > "remove" code is executed and therefore is too late. The device might
> > already be powered down, but various devm release() callbacks will be
> > trying to access it.
> 
> Can you provide a concrete example of something that is causing problems
> here?  If something is trying to access the device after remove() has
> run that sounds like it's abusing devres somehow.  It sounded from your
> commit log like this was something to do with the amount of time it took
> the driver core to action the frees rather than an ordering issue.

No it is ordering issue. I do not have a proven real-life example for
SPI, but we do have one for I2C:

https://lore.kernel.org/linux-devicetree/20210305041236.3489-7-jeff@labundy.com/

However, if we consider fairly typical SPI driver, such as
drivers/input/touchscreen/ad7877.c, you can see that it uses devm in its
probe() and because all resources are managed, it does not define
remove() at all.

So during proble we have:

<driver core allocations>
SPI: dev_pm_domain_attach
AD7877: devm_kzalloc driver structure
AD7877: devm allocation of input device
AD7877: devm custom action to disable the chip on removal
AD7877: devm IRQ request
AD7877: devm sysfs attribute group
AD7877: devm input registration
<additional devm driver core allocations?>

And on remove:

SPI: dev_pm_domain_detach !!!!!!
<deallocate additional devm driver core allocations?>
AD7877: devm input unregistration
AD7877: devm sysfs attribute group removal
AD7877: devm freeing IRQ
AD7877: devm disable the chip
AD7877: devm freeing of input device
AD7877: devm free driver structure
<deallocate driver core allocations>

Note how dev_pm_domain_detach() jumped ahead of everything, and
strictly speaking past this point we can no longer guarantee that we can
access the chip and disable it.

> 
> > devm only works when you do not mix manual resources with managed ones,
> > and when bus code allocates resources themselves (attaching a device to
> > a power domain can be viewed as resource acquisition) we violate this
> > principle. We could, of course, to make SPI bus' probe() use
> > devm_add_action_or_reset() to work in removal of the device from the
> > power domain into the stream of devm resources, but that still requires
> > changes at bus code, and I believe will complicate matters if we need to
> > extend SPI bus code to allocate more resources in probe(). So I opted
> > for opening a devm group to separate resources allocated before and
> > after probe() to be able to release them in the right order.
> 
> Sure, these are standard issues that people create with excessive use of

devm is a fact of life and we need to live with it. I am unconvinced if
it solved more issues that it brought in, but it is something that
driver authors like to use and are pushed towards.

> devm but the device's remove() callback is surely already a concern by
> itself here?

In the example above there is not one, but even if it exists, it is
called first, so in some limited cases you could have non-managed
resources allocated very last and released first in remove(), and then
have devm release the rest. However driver's remove() is not issue here,
it is bus' non-trivial remove.

Thanks.

-- 
Dmitry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ