Date: Tue, 30 Apr 2024 15:18:08 +0200
From: Eugeniu Rosca <erosca@...adit-jv.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
CC: Eugeniu Rosca <erosca@...adit-jv.com>, Dirk Behme
	<dirk.behme@...bosch.com>, <linux-kernel@...r.kernel.org>, Rafael J Wysocki
	<rafael@...nel.org>, <syzbot+ffa8143439596313a85a@...kaller.appspotmail.com>,
	Eugeniu Rosca <eugeniu.rosca@...ch.com>, Eugeniu Rosca
	<roscaeugeniu@...il.com>
Subject: Re: [PATCH] drivers: core: Make dev->driver usage safe in
 dev_uevent()

Hello Greg,

On Tue, Apr 30, 2024 at 10:27:19AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Apr 30, 2024 at 10:17:54AM +0200, Eugeniu Rosca wrote:
> > Hi Greg,
> > 
> > On Tue, Apr 30, 2024 at 09:20:10AM +0200, Greg Kroah-Hartman wrote:
> > > On Tue, Apr 30, 2024 at 06:55:31AM +0200, Dirk Behme wrote:
> > > > Inspired by the function dev_driver_string() in the same file make sure
> > > > in case of uninitialization dev->driver is used safely in dev_uevent(),
> > > > as well.
> > > 
> > > I think you are racing and just getting "lucky" with your change here,
> > > just like dev_driver_string() is doing there (that READ_ONCE() really
> > > isn't doing much to protect you...)
> > 
> > I hope the details below shed more light on the repro:
> > https://gist.github.com/erosca/1e8a87fbcc9e5ad0fecd32ebcb6266c3
> 
> Sometimes I only have access to email, nothing else, please include in
> the email the full details.

Will follow your preference in the future.

> 
> > To improve the occurrence rate:
> >  - a dummy ds90ux9xx-dummy driver was used
> >  - a dummy i2c node was added to DTS
> >  - a dummy pr_alert() was added to dev_uevent() @ drivers/base/core.c
> >  - UBSAN + KASAN enabled in .config
> 
> So this is entirely fake?  No real device or driver ever causes this
> problem?

No, the problem is real. This issue affects the end user by resetting
the HW target once every couple of months. Our synthetic test-case tries
to emulate the end user's scenario, for quicker reproduction & validation
of potential/candidate solutions.

Dirk's synthetic scenario leads to the same logs as shared by the user.
Based on that evidence, we believe we found the root cause.

> 
> Why would you add a pr_alert() call?  What is that for?
> 
> totally confused.

pr_alert() acts as a simple delay, accelerating the reproduction.

> 
> 
> > 
> > > > This change is based on the observation of the following race condition:
> > > > 
> > > > Thread #1:
> > > > ==========
> > > > 
> > > > really_probe() {
> > > > ...
> > > > probe_failed:
> > > > ...
> > > > device_unbind_cleanup(dev) {
> > > >       ...
> > > >       dev->driver = NULL;   // <= Failed probe sets dev->driver to NULL
> > > >       ...
> > > >       }
> > > > ...
> > > > }
> > > > 
> > > > Thread #2:
> > > > ==========
> > > > 
> > > > dev_uevent() {
> > > 
> > > Wait, how can dev_uevent() be called if probe fails?  Who is calling
> > > that?
> > 
> > dev_uevent() is called by reading /sys/bus/i2c/devices/<dev>/uevent.
> > Not directly triggered by the probe failure.
> > Please, kindly check the above gist/notes.
> 
> Again, put the info in the email so we can properly quote and read it,
> and it's present for the history involved (web pages disappear, email is
> for forever.)

Agreed & will follow in the future (did not want to clutter the e-mail)

> 
> So you have userspace hammering on a uevent file?  Why is it being
> called if userspace hasn't even been notified that the device has a
> driver bound to it yet?  What causes this action?

We know that the uevent subsystem is involved, based on the post-mortem logs.
Hence, reading sysfs was the easiest way to translate the real-life
use-case to a synthetic one.
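
To make the reader side concrete in the e-mail itself, here is a minimal
userspace sketch of what "reading sysfs" means here. It is not the exact
test-case from the gist; read_uevent_once() and hammer_uevent() are
hypothetical helper names, and the caller has to supply the full path of
the uevent attribute, since <dev> depends on the board.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read a sysfs "uevent" attribute once, from start to EOF.
 * Each such read is what ends up calling dev_uevent() in the kernel.
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t read_uevent_once(const char *path)
{
	char buf[4096];
	ssize_t total = 0;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		total += n;

	close(fd);
	return n < 0 ? -1 : total;
}

/*
 * Re-read the attribute "iterations" times to widen the race window
 * against a concurrently failing probe. Returns the number of
 * successful reads.
 */
static long hammer_uevent(const char *path, long iterations)
{
	long ok = 0;
	long i;

	for (i = 0; i < iterations; i++)
		if (read_uevent_once(path) >= 0)
			ok++;

	return ok;
}
```

Calling hammer_uevent() on the uevent file of the dummy device, while
the dummy driver's probe keeps failing in parallel, is the kind of load
the synthetic scenario relies on.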

> > 
> > [--- cut ---]
> > 
> > > > -	if (dev->driver)
> > > > -		add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> > > > +	/* dev->driver can change to NULL underneath us because of unbinding
> > > > +	 * or failing probe(), so be careful about accessing it.
> > > > +	 */
> > > > +	drv = READ_ONCE(dev->driver);
> > > > +	if (drv)
> > > > +		add_uevent_var(env, "DRIVER=%s", drv->name);
> > > 
> > > Again, you are just reducing the window here.  Maybe a bit, but not all
> > > that much overall as there is no real lock present.
> > 
> > The main objective of the patch is to "cache" dev->driver, such
> > that it is not cleared asynchronously from a parallel thread.
> > A refined/minimal locking alternative (if feasible) is welcome.
> 
> "caching" a stale pointer still results in a stale pointer :(

Agreed. So, likely minimal/least-intrusive locking will be necessary.
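
To illustrate the direction only, below is a userspace model of the idea,
not a kernel patch. The struct and function names (model_device,
model_driver, model_unbind, model_emit_driver) are hypothetical stand-ins,
and a pthread mutex stands in for whatever kernel lock turns out to be
appropriate. The point is that the NULL check and the dereference of the
driver pointer happen under the same lock that the unbind path takes
before clearing it.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins, not the kernel's struct device_driver/device. */
struct model_driver {
	const char *name;
};

struct model_device {
	pthread_mutex_t lock;		/* protects ->driver */
	struct model_driver *driver;
};

/* Writer side: models device_unbind_cleanup() clearing dev->driver. */
static void model_unbind(struct model_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->driver = NULL;
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Reader side: models dev_uevent() emitting DRIVER=<name>.
 * The pointer is checked and dereferenced while the lock is held, so
 * model_unbind() cannot clear it in between.
 * Returns the length of the formatted string, or 0 if no driver is bound.
 */
static int model_emit_driver(struct model_device *dev, char *buf, size_t len)
{
	int n = 0;

	if (len > 0)
		buf[0] = '\0';

	pthread_mutex_lock(&dev->lock);
	if (dev->driver)
		n = snprintf(buf, len, "DRIVER=%s", dev->driver->name);
	pthread_mutex_unlock(&dev->lock);

	return n;
}
```

In this model the reader sees either a valid driver or NULL, never a
pointer that is cleared halfway through. Which lock can safely play this
role in drivers/base/core.c, without deadlocking against the probe path,
is exactly the open question.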

> 
> thanks,
> 
> greg k-h

BR, Eugeniu
