Date: Fri, 26 Apr 2024 17:53:13 +0200
From: Dirk Behme <dirk.behme@...il.com>
To: Linux kernel mailing list <linux-kernel@...r.kernel.org>
Cc: Dirk Behme <dirk.behme@...bosch.com>
Subject: data-race in dev_uevent / really_probe?

Hi,

While debugging a NULL pointer crash on a fairly old embedded system
kernel (4.14.x), we might have found the root cause of

https://syzkaller.appspot.com/bug?extid=ffa8143439596313a85a
https://groups.google.com/g/syzkaller-upstream-moderation/c/xTpwi0C6eSY/m/FqJAQtinAQAJ

Looking at a recent kernel, the relevant code does not seem to have
changed much since then. So even in recent kernel code there appears
to be a synchronization issue between dev_uevent() and
really_probe():

Thread #1:
========

really_probe() {
...
probe_failed:
...
device_unbind_cleanup(dev) {
      ...
      dev->driver = NULL;   // <= Failed probe sets dev->driver to NULL
      ...
      }
..
}

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/dd.c#n552


Thread #2:
========

dev_uevent() {
..
if (dev->driver)
		// If dev->driver is NULLed from really_probe() from here on,
		// after above check, the system crashes
		add_uevent_var(env, "DRIVER=%s", dev->driver->name);
..
}

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/core.c#n2670

The setup is a device driver probe that fails, in our case the probe
of an I2C driver. The failing probe issues some dev_info() and
dev_err() output, which in our case seems to trigger systemd-journal
(as shown in the groups.google.com link above), which in turn calls
dev_uevent() via the given call stack.

In the end, dev_uevent() has checked dev->driver successfully. But
if, depending on timing, the failing really_probe() NULLs dev->driver
exactly after this check, the system crashes because dev_uevent()
then dereferences a dev->driver that is NULL.
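
To make the window described above concrete, here is a minimal
userspace sketch of the check-then-use pattern with the pointer read
only once. It is an illustration under stated assumptions, not kernel
code and not a proposed patch: struct driver, struct device,
format_driver_var() and unbind_cleanup() are hypothetical simplified
stand-ins, and C11 atomics stand in for whatever annotation the kernel
would use. It only shows that a single load removes the NULL
dereference between check and use; it says nothing about the lifetime
of the driver structure itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct driver {
	const char *name;
};

struct device {
	_Atomic(struct driver *) driver;
};

/*
 * Stand-in for the "DRIVER=%s" step in dev_uevent(): load dev->driver
 * once into a local, then check and use only that local copy. A
 * concurrent store of NULL can no longer hit between check and use.
 * Returns the number of characters written, or 0 if no driver is bound.
 */
static int format_driver_var(struct device *dev, char *buf, size_t len)
{
	struct driver *drv;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	drv = atomic_load_explicit(&dev->driver, memory_order_acquire);
	if (!drv)
		return 0;

	return snprintf(buf, len, "DRIVER=%s", drv->name);
}

/* Stand-in for the failed-probe path that clears dev->driver. */
static void unbind_cleanup(struct device *dev)
{
	atomic_store_explicit(&dev->driver, (struct driver *)NULL,
			      memory_order_release);
}
```

With a driver bound, format_driver_var() fills the buffer with the
DRIVER= string; after unbind_cleanup() it leaves the buffer empty and
returns 0 instead of dereferencing NULL.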

Does that make sense? Or have we missed anything?

Best regards

Dirk

