lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2025022643-predict-hedge-8c77@gregkh>
Date: Wed, 26 Feb 2025 15:33:02 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: Naman Jain <namjain@...ux.microsoft.com>
Cc: "K . Y . Srinivasan" <kys@...rosoft.com>,
	Haiyang Zhang <haiyangz@...rosoft.com>,
	Wei Liu <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>,
	Stephen Hemminger <stephen@...workplumber.org>,
	linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org,
	stable@...nel.org, Saurabh Sengar <ssengar@...ux.microsoft.com>,
	Michael Kelley <mhklinux@...look.com>,
	Long Li <longli@...rosoft.com>
Subject: Re: [PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer

On Wed, Feb 26, 2025 at 05:51:46PM +0530, Naman Jain wrote:
> 
> 
> On 2/26/2025 3:33 PM, Greg Kroah-Hartman wrote:
> > On Wed, Feb 26, 2025 at 10:43:41AM +0530, Naman Jain wrote:
> > > 
> > > 
> > > On 2/25/2025 2:09 PM, Greg Kroah-Hartman wrote:
> > > > On Tue, Feb 25, 2025 at 02:04:43PM +0530, Naman Jain wrote:
> > > > > 
> > > > > 
> > > > > On 2/25/2025 11:42 AM, Greg Kroah-Hartman wrote:
> > > > > > On Tue, Feb 25, 2025 at 10:50:01AM +0530, Naman Jain wrote:
> > > > > > > On regular bootup, devices get registered to vmbus first, so when
> > > > > > > uio_hv_generic driver for a particular device type is probed,
> > > > > > > the device is already initialized and added, so sysfs creation in
> > > > > > > uio_hv_generic probe works fine. However, when device is removed
> > > > > > > and brought back, the channel rescinds and device again gets
> > > > > > > registered to vmbus. However this time, the uio_hv_generic driver is
> > > > > > > already registered to probe for that device and in this case sysfs
> > > > > > > creation is tried before the device gets initialized completely.
> > > > > > > 
> > > > > > > Fix this by moving the core logic of sysfs creation for ring buffer,
> > > > > > > from uio_hv_generic to HyperV's vmbus driver, where rest of the sysfs
> > > > > > > attributes for the channels are defined. While doing that, make use
> > > > > > > of attribute groups and macros, instead of creating sysfs directly,
> > > > > > > to ensure better error handling and code flow.
> 
> <snip>
> 
> > > > > > > +static int hv_mmap_ring_buffer_wrapper(struct file *filp, struct kobject *kobj,
> > > > > > > +				       const struct bin_attribute *attr,
> > > > > > > +				       struct vm_area_struct *vma)
> > > > > > > +{
> > > > > > > +	struct vmbus_channel *channel = container_of(kobj, struct vmbus_channel, kobj);
> > > > > > > +
> > > > > > > +	if (!channel->mmap_ring_buffer)
> > > > > > > +		return -ENODEV;
> > > > > > > +	return channel->mmap_ring_buffer(channel, vma);
> > > > > > 
> > > > > > What is preventing mmap_ring_buffer from being set to NULL right after
> > > > > > checking it and then calling it here?  I see no locks here or where you
> > > > > > are assigning this variable at all, so what is preventing these types of
> > > > > > races?
> > > > > > 
> > > > > > thanks,
> > > > > > 
> > > > > > greg k-h
> > > > > 
> > > > > Thank you so much for reviewing.
> > > > > I spent some time to understand if this race condition can happen and it
> > > > > seems execution flow is pretty sequential, for a particular channel of a
> > > > > device.
> > > > > 
> > > > > Unless hv_uio_remove (which makes channel->mmap_ring_buffer NULL) can be
> > > > > called in parallel to hv_uio_probe (which had set
> > > > > channel->mmap_ring_buffer to non NULL), I doubt race can happen here.
> > > > > 
> > > > > Code Flow: (R, W-> Read, Write to channel->mmap_ring_buffer)
> > > > > 
> > > > > vmbus_device_register
> > > > >     device_register
> > > > >       hv_uio_probe
> > > > > 	  hv_create_ring_sysfs (W to non NULL)
> > > > >           sysfs_update_group
> > > > >             vmbus_chan_attr_is_visible (R)
> > > > >     vmbus_add_channel_kobj
> > > > >       sysfs_create_group
> > > > >         vmbus_chan_attr_is_visible  (R)
> > > > >         hv_mmap_ring_buffer_wrapper (critical section)
> > > > > 
> > > > > hv_uio_remove
> > > > >     hv_remove_ring_sysfs (W to NULL)
> > > > 
> > > > Yes, and right in here someone mmaps the file.
> > > > 
> > > > I think you can race here, no locks at all feels wrong.
> > > > 
> > > > Messing with sysfs groups and files like this is rough, and almost never
> > > > a good idea, why can't you just do this all at once with the default
> > > > groups, why is this being added/removed out-of-band?
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > 
> > > The decision to avoid creating a "ring" sysfs attribute by default
> > > likely stems from a specific use case where it wasn't needed for every
> > > device. By creating it automatically, it keeps the uio_hv_generic
> > > driver simpler and helps prevent potential race conditions. However, it
> > > has an added cost of having ring buffer for all the channels, where it
> > > is not required. I am trying to find if there are any more implications
> > > of it.
> > 
> > You do know about the "is_visable" attribute callback, right?  Why not
> > just use that instead of manually mucking around with the
> > adding/removing of sysfs attributes at all?  That is what it was
> > designed for.
> > 
> > thanks,
> > 
> > greg k-h
> 
> Hi Greg,
> Yes, I am utilizing that in my patch. For differentiating channels of a
> uio_hv_generic device, and for *selectively* creating sysfs, we
> introduced this field in channel struct "channel->mmap_ring_buffer",
> which we were setting only in uio_hv_generic. But, by the time we set
> this in uio_hv_generic driver, the sysfs creation has already gone
> through and sysfs doesn't get updated dynamically. That's where there
> was a need to call sysfs_update_group. I thought the better place to
> keep sysfs_update_group would be in vmbus driver, where we are creating
> the original sysfs entries, hence I had to add the wrapper functions.
> This led us to the race condition we are trying to address now.
> 
> 
> @@ -1838,6 +1872,10 @@ static umode_t vmbus_chan_attr_is_visible(struct
> kobject *kobj,
>  	     attr == &chan_attr_monitor_id.attr))
>  		return 0;
> 
> +	/* Hide ring attribute if channel's mmap_ring_buffer function is not yet
> initialised */
> +	if (attr ==  &chan_attr_ring_buffer.attr && !channel->mmap_ring_buffer)
> +		return 0;
> +
>  	return attr->mode;

Ok, that's good.  BUT you need to change the detection of this to be
before the device is set up in the driver.  Why can't you do it in the
device creation logic itself instead of after-the-fact when you will
ALWAYS race with userspace?

thanks,

greg k-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ