lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <960501c2-5ab2-4c81-86ac-a4477c0f708a@linux.microsoft.com>
Date: Thu, 27 Feb 2025 11:54:23 +0530
From: Naman Jain <namjain@...ux.microsoft.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: "K . Y . Srinivasan" <kys@...rosoft.com>,
 Haiyang Zhang <haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>,
 Dexuan Cui <decui@...rosoft.com>,
 Stephen Hemminger <stephen@...workplumber.org>,
 linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org,
 stable@...nel.org, Saurabh Sengar <ssengar@...ux.microsoft.com>,
 Michael Kelley <mhklinux@...look.com>, Long Li <longli@...rosoft.com>
Subject: Re: [PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer



On 2/26/2025 8:03 PM, Greg Kroah-Hartman wrote:
> On Wed, Feb 26, 2025 at 05:51:46PM +0530, Naman Jain wrote:
>>
>>
>> On 2/26/2025 3:33 PM, Greg Kroah-Hartman wrote:
>>> On Wed, Feb 26, 2025 at 10:43:41AM +0530, Naman Jain wrote:
>>>>
>>>>
>>>> On 2/25/2025 2:09 PM, Greg Kroah-Hartman wrote:
>>>>> On Tue, Feb 25, 2025 at 02:04:43PM +0530, Naman Jain wrote:
>>>>>>
>>>>>>
>>>>>> On 2/25/2025 11:42 AM, Greg Kroah-Hartman wrote:
>>>>>>> On Tue, Feb 25, 2025 at 10:50:01AM +0530, Naman Jain wrote:
>>>>>>>> On regular bootup, devices get registered to vmbus first, so when
>>>>>>>> uio_hv_generic driver for a particular device type is probed,
>>>>>>>> the device is already initialized and added, so sysfs creation in
>>>>>>>> uio_hv_generic probe works fine. However, when device is removed
>>>>>>>> and brought back, the channel rescinds and device again gets
>>>>>>>> registered to vmbus. However this time, the uio_hv_generic driver is
>>>>>>>> already registered to probe for that device and in this case sysfs
>>>>>>>> creation is tried before the device gets initialized completely.
>>>>>>>>
>>>>>>>> Fix this by moving the core logic of sysfs creation for ring buffer,
>>>>>>>> from uio_hv_generic to HyperV's vmbus driver, where rest of the sysfs
>>>>>>>> attributes for the channels are defined. While doing that, make use
>>>>>>>> of attribute groups and macros, instead of creating sysfs directly,
>>>>>>>> to ensure better error handling and code flow.
>>
>> <snip>
>>
>>>>>>>> +static int hv_mmap_ring_buffer_wrapper(struct file *filp, struct kobject *kobj,
>>>>>>>> +				       const struct bin_attribute *attr,
>>>>>>>> +				       struct vm_area_struct *vma)
>>>>>>>> +{
>>>>>>>> +	struct vmbus_channel *channel = container_of(kobj, struct vmbus_channel, kobj);
>>>>>>>> +
>>>>>>>> +	if (!channel->mmap_ring_buffer)
>>>>>>>> +		return -ENODEV;
>>>>>>>> +	return channel->mmap_ring_buffer(channel, vma);
>>>>>>>
>>>>>>> What is preventing mmap_ring_buffer from being set to NULL right after
>>>>>>> checking it and then calling it here?  I see no locks here or where you
>>>>>>> are assigning this variable at all, so what is preventing these types of
>>>>>>> races?
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> greg k-h
>>>>>>
>>>>>> Thank you so much for reviewing.
>>>>>> I spent some time to understand if this race condition can happen and it
>>>>>> seems execution flow is pretty sequential, for a particular channel of a
>>>>>> device.
>>>>>>
>>>>>> Unless hv_uio_remove (which makes channel->mmap_ring_buffer NULL) can be
>>>>>> called in parallel to hv_uio_probe (which had set
>>>>>> channel->mmap_ring_buffer to non NULL), I doubt race can happen here.
>>>>>>
>>>>>> Code Flow: (R, W-> Read, Write to channel->mmap_ring_buffer)
>>>>>>
>>>>>> vmbus_device_register
>>>>>>      device_register
>>>>>>        hv_uio_probe
>>>>>> 	  hv_create_ring_sysfs (W to non NULL)
>>>>>>            sysfs_update_group
>>>>>>              vmbus_chan_attr_is_visible (R)
>>>>>>      vmbus_add_channel_kobj
>>>>>>        sysfs_create_group
>>>>>>          vmbus_chan_attr_is_visible  (R)
>>>>>>          hv_mmap_ring_buffer_wrapper (critical section)
>>>>>>
>>>>>> hv_uio_remove
>>>>>>      hv_remove_ring_sysfs (W to NULL)
>>>>>
>>>>> Yes, and right in here someone mmaps the file.
>>>>>
>>>>> I think you can race here, no locks at all feels wrong.
>>>>>
>>>>> Messing with sysfs groups and files like this is rough, and almost never
>>>>> a good idea, why can't you just do this all at once with the default
>>>>> groups, why is this being added/removed out-of-band?
>>>>>
>>>>> thanks,
>>>>>
>>>>> greg k-h
>>>>
>>>> The decision to avoid creating a "ring" sysfs attribute by default
>>>> likely stems from a specific use case where it wasn't needed for every
>>>> device. By creating it automatically, it keeps the uio_hv_generic
>>>> driver simpler and helps prevent potential race conditions. However, it
>>>> has an added cost of having ring buffer for all the channels, where it
>>>> is not required. I am trying to find if there are any more implications
>>>> of it.
>>>
>>> You do know about the "is_visable" attribute callback, right?  Why not
>>> just use that instead of manually mucking around with the
>>> adding/removing of sysfs attributes at all?  That is what it was
>>> designed for.
>>>
>>> thanks,
>>>
>>> greg k-h
>>
>> Hi Greg,
>> Yes, I am utilizing that in my patch. For differentiating channels of a
>> uio_hv_generic device, and for *selectively* creating sysfs, we
>> introduced this field in channel struct "channel->mmap_ring_buffer",
>> which we were setting only in uio_hv_generic. But, by the time we set
>> this in uio_hv_generic driver, the sysfs creation has already gone
>> through and sysfs doesn't get updated dynamically. That's where there
>> was a need to call sysfs_update_group. I thought the better place to
>> keep sysfs_update_group would be in vmbus driver, where we are creating
>> the original sysfs entries, hence I had to add the wrapper functions.
>> This led us to the race condition we are trying to address now.
>>
>>
>> @@ -1838,6 +1872,10 @@ static umode_t vmbus_chan_attr_is_visible(struct
>> kobject *kobj,
>>   	     attr == &chan_attr_monitor_id.attr))
>>   		return 0;
>>
>> +	/* Hide ring attribute if channel's mmap_ring_buffer function is not yet
>> initialised */
>> +	if (attr ==  &chan_attr_ring_buffer.attr && !channel->mmap_ring_buffer)
>> +		return 0;
>> +
>>   	return attr->mode;
> 
> Ok, that's good.  BUT you need to change the detection of this to be
> before the device is set up in the driver.  Why can't you do it in the
> device creation logic itself instead of after-the-fact when you will
> ALWAYS race with userspace?
> 
> thanks,
> 
> greg k-h

Sure, will check more on this. Thanks.

Regards,
Naman


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ