[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9ee65987-4353-42c6-b517-d6f52428f718@linux.microsoft.com>
Date: Tue, 25 Feb 2025 14:04:43 +0530
From: Naman Jain <namjain@...ux.microsoft.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: "K . Y . Srinivasan" <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>,
Dexuan Cui <decui@...rosoft.com>,
Stephen Hemminger <stephen@...workplumber.org>,
linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org,
stable@...nel.org, Saurabh Sengar <ssengar@...ux.microsoft.com>,
Michael Kelley <mhklinux@...look.com>, Long Li <longli@...rosoft.com>
Subject: Re: [PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer
On 2/25/2025 11:42 AM, Greg Kroah-Hartman wrote:
> On Tue, Feb 25, 2025 at 10:50:01AM +0530, Naman Jain wrote:
>> On regular bootup, devices get registered to vmbus first, so when
>> uio_hv_generic driver for a particular device type is probed,
>> the device is already initialized and added, so sysfs creation in
>> uio_hv_generic probe works fine. However, when device is removed
>> and brought back, the channel rescinds and device again gets
>> registered to vmbus. However this time, the uio_hv_generic driver is
>> already registered to probe for that device and in this case sysfs
>> creation is tried before the device gets initialized completely.
>>
>> Fix this by moving the core logic of sysfs creation for ring buffer,
>> from uio_hv_generic to HyperV's vmbus driver, where rest of the sysfs
>> attributes for the channels are defined. While doing that, make use
>> of attribute groups and macros, instead of creating sysfs directly,
>> to ensure better error handling and code flow.
>>
>> Problem path:
>> vmbus_device_register
>> device_register
>> uio_hv_generic probe
>> sysfs_create_bin_file (fails here)
>> kset_create_and_add (dependency)
>> vmbus_add_channel_kobj (dependency)
>>
>> Fixes: 9ab877a6ccf8 ("uio_hv_generic: make ring buffer attribute for primary channel")
>> Cc: stable@...nel.org
>> Suggested-by: Saurabh Sengar <ssengar@...ux.microsoft.com>
>> Suggested-by: Michael Kelley <mhklinux@...look.com>
>> Signed-off-by: Naman Jain <namjain@...ux.microsoft.com>
>> ---
>> Hi,
>> This is the first patch after initial RFC was posted.
>> https://lore.kernel.org/all/20250214064351.8994-1-namjain@linux.microsoft.com/
>>
>> Changes since RFC patch:
>> * Different approach to solve the problem is proposed (credits to
>> Michael Kelley).
>> * Core logic for sysfs creation moved out of uio_hv_generic, to VMBus
>> drivers where rest of the sysfs attributes for a VMBus channel
>> are defined. (addressed Greg's comments)
>> * Used attribute groups instead of sysfs_create* functions, and bundled
>> ring attribute with other attributes for the channel sysfs.
>>
>> Error logs:
>>
>> [ 35.574120] ------------[ cut here ]------------
>> [ 35.574122] WARNING: CPU: 0 PID: 10 at fs/sysfs/file.c:591 sysfs_create_bin_file+0x81/0x90
>> [ 35.574168] Workqueue: hv_pri_chan vmbus_add_channel_work
>> [ 35.574172] RIP: 0010:sysfs_create_bin_file+0x81/0x90
>> [ 35.574197] Call Trace:
>> [ 35.574199] <TASK>
>> [ 35.574200] ? show_regs+0x69/0x80
>> [ 35.574217] ? __warn+0x8d/0x130
>> [ 35.574220] ? sysfs_create_bin_file+0x81/0x90
>> [ 35.574222] ? report_bug+0x182/0x190
>> [ 35.574225] ? handle_bug+0x5b/0x90
>> [ 35.574244] ? exc_invalid_op+0x19/0x70
>> [ 35.574247] ? asm_exc_invalid_op+0x1b/0x20
>> [ 35.574252] ? sysfs_create_bin_file+0x81/0x90
>> [ 35.574255] hv_uio_probe+0x1e7/0x410 [uio_hv_generic]
>> [ 35.574271] vmbus_probe+0x3b/0x90
>> [ 35.574275] really_probe+0xf4/0x3b0
>> [ 35.574279] __driver_probe_device+0x8a/0x170
>> [ 35.574282] driver_probe_device+0x23/0xc0
>> [ 35.574285] __device_attach_driver+0xb5/0x140
>> [ 35.574288] ? __pfx___device_attach_driver+0x10/0x10
>> [ 35.574291] bus_for_each_drv+0x86/0xe0
>> [ 35.574294] __device_attach+0xc1/0x200
>> [ 35.574297] device_initial_probe+0x13/0x20
>> [ 35.574315] bus_probe_device+0x99/0xa0
>> [ 35.574318] device_add+0x647/0x870
>> [ 35.574320] ? hrtimer_init+0x28/0x70
>> [ 35.574323] device_register+0x1b/0x30
>> [ 35.574326] vmbus_device_register+0x83/0x130
>> [ 35.574328] vmbus_add_channel_work+0x135/0x1a0
>> [ 35.574331] process_one_work+0x177/0x340
>> [ 35.574348] worker_thread+0x2b2/0x3c0
>> [ 35.574350] kthread+0xe3/0x1f0
>> [ 35.574353] ? __pfx_worker_thread+0x10/0x10
>> [ 35.574356] ? __pfx_kthread+0x10/0x10
>>
>> ---
>> drivers/hv/hyperv_vmbus.h | 4 +++
>> drivers/hv/vmbus_drv.c | 62 ++++++++++++++++++++++++++++++++++++
>> drivers/uio/uio_hv_generic.c | 34 ++------------------
>> include/linux/hyperv.h | 3 ++
>> 4 files changed, 72 insertions(+), 31 deletions(-)
>>
>> diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
>> index 29780f3a7478..e0c7b75e6c7a 100644
>> --- a/drivers/hv/hyperv_vmbus.h
>> +++ b/drivers/hv/hyperv_vmbus.h
>> @@ -477,4 +477,8 @@ static inline int hv_debug_add_dev_dir(struct hv_device *dev)
>>
>> #endif /* CONFIG_HYPERV_TESTING */
>>
>> +/* Create and remove sysfs entry for memory mapped ring buffers for a channel */
>> +int hv_create_ring_sysfs(struct vmbus_channel *channel);
>> +int hv_remove_ring_sysfs(struct vmbus_channel *channel);
>> +
>> #endif /* _HYPERV_VMBUS_H */
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
>> index 22afebfc28ff..0110643bad3f 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -1802,6 +1802,39 @@ static ssize_t subchannel_id_show(struct vmbus_channel *channel,
>> }
>> static VMBUS_CHAN_ATTR_RO(subchannel_id);
>>
>> +/* Functions to create sysfs interface to allow mmap of the ring buffers.
>> + * The ring buffer is allocated as contiguous memory by vmbus_open
>> + */
>> +static int hv_mmap_ring_buffer(struct vmbus_channel *channel, struct vm_area_struct *vma)
>> +{
>> + void *ring_buffer = page_address(channel->ringbuffer_page);
>> +
>> + if (channel->state != CHANNEL_OPENED_STATE)
>> + return -ENODEV;
>> +
>> + return vm_iomap_memory(vma, virt_to_phys(ring_buffer),
>> + channel->ringbuffer_pagecount << PAGE_SHIFT);
>> +}
>> +
>> +static int hv_mmap_ring_buffer_wrapper(struct file *filp, struct kobject *kobj,
>> + const struct bin_attribute *attr,
>> + struct vm_area_struct *vma)
>> +{
>> + struct vmbus_channel *channel = container_of(kobj, struct vmbus_channel, kobj);
>> +
>> + if (!channel->mmap_ring_buffer)
>> + return -ENODEV;
>> + return channel->mmap_ring_buffer(channel, vma);
>
> What is preventing mmap_ring_buffer from being set to NULL right after
> checking it and then calling it here? I see no locks here or where you
> are assigning this variable at all, so what is preventing these types of
> races?
>
> thanks,
>
> greg k-h
Thank you so much for reviewing.
I spent some time to understand if this race condition can happen and it
seems execution flow is pretty sequential, for a particular channel of a
device.
Unless hv_uio_remove (which makes channel->mmap_ring_buffer NULL) can be
called in parallel to hv_uio_probe (which had set
channel->mmap_ring_buffer to non NULL), I doubt race can happen here.
Code Flow: (R, W-> Read, Write to channel->mmap_ring_buffer)
vmbus_device_register
device_register
hv_uio_probe
hv_create_ring_sysfs (W to non NULL)
sysfs_update_group
vmbus_chan_attr_is_visible (R)
vmbus_add_channel_kobj
sysfs_create_group
vmbus_chan_attr_is_visible (R)
hv_mmap_ring_buffer_wrapper (critical section)
hv_uio_remove
hv_remove_ring_sysfs (W to NULL)
sysfs_update_group
vmbus_chan_attr_is_visible (R)
First probe:
hv_uio_probe
hv_create_ring_sysfs (W to non NULL)
sysfs_update_group
vmbus_chan_attr_is_visible (R)
hv_mmap_ring_buffer_wrapper (critical section)
Regards,
Naman
Powered by blists - more mailing lists