Message-Id: <DFW7DOC56CUG.3PV4UGDTMUYE1@kernel.org>
Date: Fri, 23 Jan 2026 20:07:44 +0100
From: "Danilo Krummrich" <dakr@...nel.org>
To: "Gui-Dong Han" <hanguidong02@...il.com>
Cc: "Jon Hunter" <jonathanh@...dia.com>, "Marek Szyprowski"
 <m.szyprowski@...sung.com>, "Mark Brown" <broonie@...nel.org>,
 <gregkh@...uxfoundation.org>, <rafael@...nel.org>,
 <linux-kernel@...r.kernel.org>, <baijiaju1990@...il.com>, "Qiu-ji Chen"
 <chenqiuji666@...il.com>, <Aishwarya.TCV@....com>,
 "linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v5] driver core: enforce device_lock for
 driver_match_device()

On Fri Jan 23, 2026 at 7:53 PM CET, Gui-Dong Han wrote:
> It seems the issue is simpler than a recursive registration deadlock.
> Looking at the logs, tegra_qspi_probe triggers a NULL pointer
> dereference (Oops) while holding the device_lock. The mutex likely
> remains marked as held/orphaned, blocking subsequent driver bindings
> on the same bus.
>
> This likely explains why lockdep was silent. Since this is not a lock
> dependency cycle or a recursive locking violation, but rather a lock
> remaining held by a terminated task, lockdep would not flag it as a
> deadlock pattern.
>
> This is indeed a side effect of enforcing the lock here: it amplifies
> the impact of a crash. However, an Oops while holding the device_lock
> is generally catastrophic regardless.

This makes sense to me; it might indeed be as simple as that.
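
For illustration only, the failure mode described above can be reproduced in
userspace. This is a rough sketch with POSIX threads, not kernel code:
crashing_probe() and lock_stuck_after_owner_exit() are hypothetical names, and
a thread that returns without unlocking stands in for the task killed by the
Oops. It assumes a default (non-robust) pthread mutex.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a probe routine that dies with the lock held. */
static void *crashing_probe(void *arg)
{
	pthread_mutex_t *lock = arg;

	assert(lock != NULL);
	pthread_mutex_lock(lock);
	/* The "Oops": the thread ends here and never reaches an unlock. */
	return NULL;
}

/* Returns true if the lock is still held after its owner has exited. */
static bool lock_stuck_after_owner_exit(void)
{
	pthread_mutex_t lock;
	pthread_t probe;

	if (pthread_mutex_init(&lock, NULL) != 0)
		return false;
	if (pthread_create(&probe, NULL, crashing_probe, &lock) != 0)
		return false;
	pthread_join(probe, NULL);

	/*
	 * No dependency cycle and no recursive acquisition is involved:
	 * the owner is simply gone, so every later attempt to take the
	 * lock fails (or, with a blocking lock call, waits forever).
	 */
	return pthread_mutex_trylock(&lock) == EBUSY;
}
```

A blocking pthread_mutex_lock() in place of the trylock would hang, which is
the userspace analogue of subsequent bindings stalling behind the device_lock.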

> Following up on our previous discussion [1], refactoring
> driver_override would resolve this. We could move driver_override to
> struct device and protect it with a dedicated lock (e.g.,
> driver_override_lock). We would then replace driver_set_override with
> dev_set_driver_override and add dev_access_driver_override with
> internal lock assertions. This allows us to remove device_lock from
> the two match paths, reducing contention and preventing a single crash
> from stalling the whole bus.
>
> However, this deviates from the current paradigm where device_lock
> protects sysfs attributes (like waiting_for_supplier and
> power/control). If other sysfs attributes are found to share similar
> constraints or would benefit from finer-grained locking (which
> requires further investigation), we might have a stronger argument for
> introducing a more generic sysfs_lock to handle this class of
> attributes. We would also need to carefully verify safety during
> device removal.
>
> Danilo, what are your thoughts on this refactoring plan? I am willing
> to attempt it, but since it touches the driver core, documentation,
> and 10+ bus drivers, and I haven't submitted such a large series
> before, it may take me a few weeks to get an initial version out, and
> additional time to iterate based on review feedback until it is ready
> for merging. If you prefer to handle it yourself to expedite things,
> please let me know so we don't duplicate efforts.

I think moving driver_override to struct device and providing accessors with
proper lockdep assertions is the correct thing to do. With that, I do not think
a separate lock is necessary.
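
To make the shape of this concrete, here is a minimal userspace sketch, under
heavy assumptions. The struct layout, the lock_held flag, and
driver_match_override() are hypothetical; only the two accessor names come from
the proposal quoted above. A pthread mutex plays the role of the existing
device_lock, and a plain assert() stands in for what would presumably be a
lockdep assertion such as lockdep_assert_held() in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct device. */
struct device {
	pthread_mutex_t lock;	/* plays the role of device_lock */
	bool lock_held;		/* crude substitute for lockdep tracking */
	char *driver_override;	/* lives in the device, not in bus structs */
};

static void device_init(struct device *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->lock_held = false;
	dev->driver_override = NULL;
}

static void device_lock(struct device *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->lock_held = true;
}

static void device_unlock(struct device *dev)
{
	dev->lock_held = false;
	pthread_mutex_unlock(&dev->lock);
}

/* Setter: takes the device lock itself. A NULL name clears the override. */
static int dev_set_driver_override(struct device *dev, const char *name)
{
	char *copy = NULL;

	if (name) {
		size_t len = strlen(name) + 1;

		copy = malloc(len);
		if (!copy)
			return -1;
		memcpy(copy, name, len);
	}

	device_lock(dev);
	free(dev->driver_override);
	dev->driver_override = copy;
	device_unlock(dev);
	return 0;
}

/* Accessor: the caller must already hold the device lock. */
static const char *dev_access_driver_override(struct device *dev)
{
	/* Only checks that the lock is held, not by whom. */
	assert(dev->lock_held);
	return dev->driver_override;
}

/* Hypothetical match helper: no override set means any driver may match. */
static bool driver_match_override(struct device *dev, const char *drv_name)
{
	const char *override;
	bool match;

	device_lock(dev);
	override = dev_access_driver_override(dev);
	match = !override || strcmp(override, drv_name) == 0;
	device_unlock(dev);
	return match;
}
```

The point of the sketch is that the accessor does not lock anything on its own;
it only checks that the existing lock is held, so no second lock is introduced.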

Please feel free to follow up on this.
