linux-kernel - Re: [PATCH 3/3] rust: devres: fix race in Devres::drop()

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aErxSYp0AsHGWt0E@tardis.local>
Date: Thu, 12 Jun 2025 08:24:57 -0700
From: Boqun Feng <boqun.feng@...il.com>
To: Danilo Krummrich <dakr@...nel.org>
Cc: gregkh@...uxfoundation.org, rafael@...nel.org, ojeda@...nel.org,
	alex.gaynor@...il.com, gary@...yguo.net, bjorn3_gh@...tonmail.com,
	benno.lossin@...ton.me, a.hindborg@...nel.org, aliceryhl@...gle.com,
	tmgross@...ch.edu, chrisi.schrefl@...il.com,
	rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] rust: devres: fix race in Devres::drop()

On Wed, Jun 04, 2025 at 11:49:54AM +0200, Danilo Krummrich wrote:
> On Tue, Jun 03, 2025 at 04:26:01PM -0700, Boqun Feng wrote:
> > On Tue, Jun 03, 2025 at 10:48:52PM +0200, Danilo Krummrich wrote:
> > > In Devres::drop() we first remove the devres action and then drop the
> > > wrapped device resource.
> > > 
> > > The design goal is to give the owner of a Devres object control over when
> > > the device resource is dropped, but limit the overall scope to the
> > > corresponding device being bound to a driver.
> > > 
> > > However, there's a race that was introduced with commit 8ff656643d30
> > > ("rust: devres: remove action in `Devres::drop`"), but also has been
> > > (partially) present from the initial version on.
> > > 
> > > In Devres::drop(), the devres action is removed successfully and
> > > subsequently the destructor of the wrapped device resource runs.
> > > However, there is no guarantee that the destructor of the wrapped device
> > > resource completes before the driver core is done unbinding the
> > > corresponding device.
> > > 
> > > If in Devres::drop(), the devres action can't be removed, it means that
> > > the devres callback has been executed already, or is still running
> > > concurrently. In case of the latter, either Devres::drop() wins revoking
> > > the Revocable or the devres callback wins revoking the Revocable. If
> > > Devres::drop() wins, we (again) have no guarantee that the destructor of
> > > the wrapped device resource completes before the driver core is done
> > > unbinding the corresponding device.
> > > 
> > > Depending on the specific device resource, this can potentially lead to
> > > user-after-free bugs.
> > > 
> > 
> > This all sounds reasonable, one question though: it seems to me the
> > problem exists only for the device resources that expect the device
> > being bounded, so hypothetically if the device resources can be
> > programmed against unbound devices, then the current behavior should be
> > fine?
> 
> I don't think that such device resources exist from a semantical point of view.
> 
> We always have to guarantee that a driver released the device resources once the
> corresponding device is unbound from the driver.
> 
> However, there certainly are differences between how fatal it is if we don't do
> so.
> 
> Complementing your example below, if we for instance fail to release a memory
> region in time, a subsequent driver probing the device may fail requesting the
> corresponding region.
> 
> > For example, in your case, you want free_irq() to happen before
> > the device becomes unbound, which is of course reasonable, but it sounds
> > more like a design choice (or what device model we want to use), because
> > hypothetically you can program an irq that still works even if the
> > device is unbound, no?
> 
> You can, just like for every other registration (e.g. class devices, such as
> misc device), but it's sub-optimal, since then we could not treat the
> registering device of the registration as &Device<Bound>, which allows direct
> access to device resources with Devres::access(). Please see also [1] and [2].
> 
> We have two (safe and correct) ways to access device resources, one is the
> runtime checked access through Revocable::try_access() (which implies the RCU
> read-side critical section and atomic check); the other one is the compile-time
> checked access through providing a &Device<Bound> as cookie for directy access
> without runtime overhead.
> 
> Wherever possible, we want to enable the latter, which means that registrations
> need to be properly guarded.
> 
> [1] https://lore.kernel.org/lkml/20250530142447.166524-6-dakr@kernel.org/
> [2] https://lore.kernel.org/lkml/20250530142447.166524-7-dakr@kernel.org/
> 

Thanks for the explanation, and sorry I'm a bit late for the response. I
was trying to find a place that we should document this, how about the
diff below:

------------
diff --git a/rust/kernel/devres.rs b/rust/kernel/devres.rs
index 0f79a2ec9474..c8b9754e411b 100644
--- a/rust/kernel/devres.rs
+++ b/rust/kernel/devres.rs
@@ -31,7 +31,8 @@ struct DevresInner<T> {
 /// manage their lifetime.
 ///
 /// [`Device`] bound resources should be freed when either the resource goes out of scope or the
-/// [`Device`] is unbound respectively, depending on what happens first.
+/// [`Device`] is unbound respectively, depending on what happens first. And if the resource goes
+/// out of scope first, [`Device`] unbinding will wait until the resource being freed.
 ///
 /// To achieve that [`Devres`] registers a devres callback on creation, which is called once the
 /// [`Device`] is unbound, revoking access to the encapsulated resource (see also [`Revocable`]).


Regards,
Boqun

> > Again this sounds reasonable to me, just want to check my understanding
> > here.
> > 
> > Regards,
> > Boqun
> > 
> > > In order to fix this, implement the following logic.
> > > 
> > > In the devres callback, we're always good when we get to revoke the
> > > device resource ourselves, i.e. Revocable::revoke() returns true.
> > > 
> > > If Revocable::revoke() returns false, it means that Devres::drop(),
> > > concurrently, already drops the device resource and we have to wait for
> > > Devres::drop() to signal that it finished dropping the device resource.
> > > 
> > > Note that if we hit the case where we need to wait for the completion of
> > > Devres::drop() in the devres callback, it means that we're actually
> > > racing with a concurrent Devres::drop() call, which already started
> > > revoking the device resource for us. This is rather unlikely and means
> > > that the concurrent Devres::drop() already started doing our work and we
> > > just need to wait for it to complete it for us. Hence, there should not
> > > be any additional overhead from that.
> > > 
> > > (Actually, for now it's even better if Devres::drop() does the work for
> > > us, since it can bypass the synchronize_rcu() call implied by
> > > Revocable::revoke(), but this goes away anyways once I get to implement
> > > the split devres callback approach, which allows us to first flip the
> > > atomics of all registered Devres objects of a certain device, execute a
> > > single synchronize_rcu() and then drop all revocable objects.)
> > > 
> > > In Devres::drop() we try to revoke the device resource. If that is *not*
> > > successful, it means that the devres callback already did and we're good.
> > > 
> > > Otherwise, we try to remove the devres action, which, if successful,
> > > means that we're good, since the device resource has just been revoked
> > > by us *before* we removed the devres action successfully.
> > > 
> > > If the devres action could not be removed, it means that the devres
> > > callback must be running concurrently, hence we signal that the device
> > > resource has been revoked by us, using the completion.
> > > 
> > > This makes it safe to drop a Devres object from any task and at any point
> > > of time, which is one of the design goals.
> > > 
> > [...]