[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250514225508.52629-1-mhklinux@outlook.com>
Date: Wed, 14 May 2025 15:55:08 -0700
From: mhkelley58@...il.com
To: haiyangz@...rosoft.com,
wei.liu@...nel.org,
decui@...rosoft.com,
kys@...rosoft.com,
linux-kernel@...r.kernel.org,
linux-hyperv@...r.kernel.org
Subject: [PATCH 1/1] Drivers: hv: vmbus: Add comments about races with "channels" sysfs dir
From: Michael Kelley <mhklinux@...look.com>
The VMBus driver code has some inherent races in the creation of the
"channels" sysfs subdirectory and its per-channel numbered subdirectories.
These races have not generally been recognized or understood. Add some
comments to call them out. No code changes.
Signed-off-by: Michael Kelley <mhklinux@...look.com>
---
drivers/hv/vmbus_drv.c | 42 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 40 insertions(+), 2 deletions(-)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index e3d51a316316..4d4f75f578c7 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -707,7 +707,30 @@ static const struct hv_vmbus_device_id *hv_vmbus_get_id(const struct hv_driver *
return id;
}
-/* vmbus_add_dynid - add a new device ID to this driver and re-probe devices */
+/* vmbus_add_dynid - add a new device ID to this driver and re-probe devices
+ *
+ * This function can race with vmbus_device_register(). This function is
+ * typically running on a user thread in response to writing to the "new_id"
+ * sysfs entry for a driver. vmbus_device_register() is running on a
+ * workqueue thread in response to the Hyper-V host offering a device to the
+ * guest. This function calls driver_attach(), which looks for an existing
+ * device matching the new id, and attaches the driver to which the new id
+ * has been assigned. vmbus_device_register() calls device_register(), which
+ * looks for a driver that matches the device being registered. If both
+ * operations are running simultaneously, the device driver probe function runs
+ * on whichever thread establishes the linkage between the driver and device.
+ *
+ * In most cases, it doesn't matter which thread runs the driver probe
+ * function. But if vmbus_device_register() does not find a matching driver,
+ * it proceeds to create the "channels" subdirectory and numbered per-channel
+ * subdirectory in sysfs. While that multi-step creation is in progress, this
+ * function could run the driver probe function. If the probe function checks
+ * for, or operates on, entries in the "channels" subdirectory, including by
+ * calling hv_create_ring_sysfs(), the operation may or may not succeed
+ * depending on the race. The race can't create a kernel failure in VMBus
+ * or device subsystem code, but probe functions in VMBus drivers doing such
+ * operations must be prepared for the failure case.
+ */
static int vmbus_add_dynid(struct hv_driver *drv, guid_t *guid)
{
struct vmbus_dynid *dynid;
@@ -1921,7 +1944,8 @@ static const struct kobj_type vmbus_chan_ktype = {
* ring for userspace to use.
* Note: Race conditions can happen with userspace and it is not encouraged to create new
* use-cases for this. This was added to maintain backward compatibility, while solving
- * one of the race conditions in uio_hv_generic while creating sysfs.
+ * one of the race conditions in uio_hv_generic while creating sysfs. See comments with
+ * vmbus_add_dynid() and vmbus_device_register().
*
* Returns 0 on success or error code on failure.
*/
@@ -2055,6 +2079,20 @@ int vmbus_device_register(struct hv_device *child_device_obj)
return ret;
}
+ /*
+ * If device_register() found a driver to assign to the device, the
+ * driver's probe function has already run at this point. If that
+ * probe function accesses or operates on the "channels" subdirectory
+ * in sysfs, those operations will have failed because the "channels"
+ * subdirectory doesn't exist until the code below runs. Or if the
+ * probe function creates a /dev entry, a user space program could
+ * find and open the /dev entry, and then create a race by accessing
+ * the "channels" subdirectory while the creation steps are in progress
+ * here. The race can't result in a kernel failure, but the user space
+ * program may get an error in accessing "channels" or its
+ * subdirectories. See also comments with vmbus_add_dynid() about a
+ * related race condition.
+ */
child_device_obj->channels_kset = kset_create_and_add("channels",
NULL, kobj);
if (!child_device_obj->channels_kset) {
--
2.25.1
Powered by blists - more mailing lists