lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALbr=Laqv6pVPfhWtvEdC1hayEvhePBFKq5t+GuNvBCCdDxaBw@mail.gmail.com>
Date: Tue, 13 Jan 2026 20:42:04 +0800
From: Gui-Dong Han <hanguidong02@...il.com>
To: Danilo Krummrich <dakr@...nel.org>
Cc: gregkh@...uxfoundation.org, rafael@...nel.org, 
	linux-kernel@...r.kernel.org, baijiaju1990@...il.com, 
	Qiu-ji Chen <chenqiuji666@...il.com>
Subject: Re: [PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device()

On Tue, Jan 13, 2026 at 5:55 PM Danilo Krummrich <dakr@...nel.org> wrote:
>
> On Thu Nov 27, 2025 at 3:57 PM CET, Gui-Dong Han wrote:
> > diff --git a/drivers/base/base.h b/drivers/base/base.h
> > index 86fa7fbb3548..72791125de91 100644
> > --- a/drivers/base/base.h
> > +++ b/drivers/base/base.h
> > @@ -166,6 +166,9 @@ void device_set_deferred_probe_reason(const struct device *dev, struct va_format
> >  static inline int driver_match_device(const struct device_driver *drv,
> >                                     struct device *dev)
> >  {
> > +     /* Protects against driver_set_override() races */
> > +     device_lock_assert(dev);
> > +
> >       return drv->bus->match ? drv->bus->match(dev, drv) : 1;
> >  }
>
> I am not convinced that this is the correct fix, since
>
>   1. Not all match() callbacks access the driver_override field,
>
>   2. driver_override is accessed in other places as well,
>
>   3. driver_override is a bus device specific field (with a common
>      helper admittedly).
>
> I think it would be better to make driver_override a field in the base
> struct device. This way we can not only provide driver_set_override(), but also
> driver_get_override(), which should contain the device_lock_assert() instead.

Hi Danilo,

Thanks for the feedback. While I agree that moving driver_override to
struct device is a good idea for future refactoring, I believe this
patch is necessary as the immediate fix due to existing locking
constraints.

The main constraint is the device_lock. Currently,
driver_match_device() is called from three paths. One of them
(__device_attach_driver) already holds device_lock to protect the
whole binding process. We cannot add locking inside the bus match()
callbacks (or after refactoring, acquiring the lock to call a
hypothetical getter inside match) because it would deadlock this
existing path. Since we cannot remove the lock from
__device_attach_driver, the only safe solution is to ensure the lock
is held at the other two call sites. Moving the field to struct device
would not change this locking asymmetry. The device_lock is a coarse
lock intended for correctness here, and contention is low.

I understand you might be concerned about performance. I tested this
using ftrace. driver_match_device() is a cold path (called ~600 times
during boot in my setup). I observed no measurable boot time
regression. After boot, it is rarely called (e.g., module loading). So
I think the impact is negligible.

Regarding other access sites: driver_override is specifically designed
for matching. Apart from sysfs (which I am fixing separately [1]), I
have not found other places that require locking. Other accesses occur
mostly during init/cleanup (e.g., pci_device_probe/pci_release_dev)
where concurrency is not an issue. If there are other specific race
conditions, we can fix them individually as they are not universal. In
contrast, the match() path is the critical concurrent path where the
feature is actually implemented, affecting over 10 buses.

Even if we refactor (e.g., adding dev_access_driver_override), we
would still need to hold the lock at these call sites for the reasons
above. This patch is the minimal fix to close the UAF. Refactoring
involves the core, documentation, and multiple bus drivers, which will
take time to discuss and design. I suggest merging this fix first, and
we can explore the refactoring as a follow-up.

[1] https://lore.kernel.org/all/20251202174948.12693-1-hanguidong02@gmail.com/

Thanks

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ