Message-ID: <20250917030955.41708-1-deepak.sharma.472935@gmail.com>
Date: Wed, 17 Sep 2025 08:39:55 +0530
From: Deepak Sharma <deepak.sharma.472935@...il.com>
To: gregkh@...uxfoundation.org
Cc: linux-kernel@...r.kernel.org,
linux-kernel-mentees@...ts.linux.dev,
Deepak Sharma <deepak.sharma.472935@...il.com>,
syzbot+6c905ab800f20cf4086c@...kaller.appspotmail.com
Subject: [PATCH] drivers: core: Fix synchronization of removal of device with rpm work

Syzbot reports a use-after-free in `rpm_suspend`; the free happens in
the `usb_disconnect` path.

All line number references are against commit
d69eb204c255c35abd9e8cb621484e8074c75eaa.

This points to a possible synchronization issue. In `usb_disconnect`
there is a call to `pm_runtime_barrier`, but it acts only as a sort of
"flush" (cancelling any pending rpm requests that have not started
yet). Nothing in this call path, at or after that point, seems to
increase the device usage count either.

Then we have an eventual call to `device_del`, which in turn calls
`device_pm_remove`. After that `pm_runtime_barrier`, no code
synchronizes with the PM system in any way.

Now suppose the `rpm_suspend` work queued on timer expiration runs
during this unsynchronized window. This can create a few interesting
situations; I will address one.

Suppose the `rpm_suspend` work drops `dev->power.lock` at
`drivers/base/power/runtime.c:723`, and `device_pm_remove` then
proceeds as normal, cleaning up the device. No later call cancels the
work that is still pending, and since the lock has been given up,
removal continues and ends up deleting the device. The suspend work
then touches freed memory, which is the use-after-free observed.

So at device removal we could call `pm_runtime_forbid` followed by
`pm_runtime_barrier`. This completes any pending work and forbids any
new work from being queued. Once that returns, we can call
`device_pm_remove`. `pm_runtime_forbid` does not seem to influence the
behavior of `device_pm_remove` (though it does lead to a call to
`pm_runtime_get_noresume()`, which touches the device usage count, it
should still work the same).

Reported-by: syzbot+6c905ab800f20cf4086c@...kaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6c905ab800f20cf4086c
Signed-off-by: Deepak Sharma <deepak.sharma.472935@...il.com>
---
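To make the intent of the forbid-then-barrier ordering easier to follow,
here is a minimal user-space sketch. It is only a toy model under loud
assumptions: `struct toy_dev` and every `toy_*` function are hypothetical
names that do not exist in the kernel, there is no locking or workqueue,
and the only behavior modeled is that a held usage count blocks new
suspend requests while the barrier drops a queued one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct toy_dev {
	int usage_count;	/* stands in for dev->power.usage_count */
	bool runtime_auto;	/* stands in for dev->power.runtime_auto */
	bool suspend_queued;	/* a suspend request queued but not started */
};

static void toy_init(struct toy_dev *d)
{
	d->usage_count = 0;
	d->runtime_auto = true;
	d->suspend_queued = false;
}

/* Models pm_runtime_forbid(): take one usage count reference, once. */
static void toy_forbid(struct toy_dev *d)
{
	if (!d->runtime_auto)
		return;
	d->runtime_auto = false;
	d->usage_count++;
}

/* Models pm_runtime_allow(): drop the reference taken by toy_forbid(). */
static void toy_allow(struct toy_dev *d)
{
	if (d->runtime_auto)
		return;
	assert(d->usage_count > 0);
	d->runtime_auto = true;
	d->usage_count--;
}

/* Models the barrier: drop a request that is queued but not started. */
static void toy_barrier(struct toy_dev *d)
{
	d->suspend_queued = false;
}

/*
 * Queue a suspend request only if nobody holds a usage count reference.
 * Returns true if the request was queued.
 */
static bool toy_request_suspend(struct toy_dev *d)
{
	if (d->usage_count > 0)
		return false;
	d->suspend_queued = true;
	return true;
}
```

In this model, calling `toy_forbid()` and then `toy_barrier()` leaves no
queued request behind, and `toy_request_suspend()` keeps returning false
until `toy_allow()` drops the reference again.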
drivers/base/core.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index d22d6b23e758..616fd02d18ed 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3876,7 +3876,13 @@ void device_del(struct device *dev)
 	device_remove_file(dev, &dev_attr_uevent);
 	device_remove_attrs(dev);
 	bus_remove_device(dev);
+	/* We need to forbid and then proceed with a barrier here,
+	 * so that any pending work is flushed
+	 */
+	pm_runtime_forbid(dev);
+	pm_runtime_barrier(dev);
 	device_pm_remove(dev);
+	pm_runtime_allow(dev);
 	driver_deferred_probe_del(dev);
 	device_platform_notify_remove(dev);
 	device_links_purge(dev);
--
2.51.0