lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20081216212150.GA20721@kroah.com>
Date:	Tue, 16 Dec 2008 13:21:50 -0800
From:	Greg KH <greg@...ah.com>
To:	Hans Verkuil <hverkuil@...all.nl>
Cc:	linux-kernel@...r.kernel.org, v4l <video4linux-list@...hat.com>,
	Laurent Pinchart <laurent.pinchart@...net.be>
Subject: Re: [BUG] cdev_put() race condition

On Tue, Dec 16, 2008 at 10:00:51PM +0100, Hans Verkuil wrote:
> On Tuesday 16 December 2008 21:22:48 Greg KH wrote:
> > On Mon, Dec 08, 2008 at 09:56:26PM +0100, Hans Verkuil wrote:
> > > Hi Greg,
> > > 
> > > Laurent found a race condition in the uvc driver that we traced to the 
> > > way chrdev_open and cdev_put/get work.
> > > 
> > > You need the following ingredients to reproduce it:
> > > 
> > > 1) a hot-pluggable char device like an USB webcam.
> > > 2) a manually created device node for such a webcam instead of relying 
> > > on udev.
> > > 
> > > In order to easily force this situation you would also need to add a  
> > > delay to the char device's release() function. For webcams that would 
> > > be at the top of v4l2_chardev_release() in 
> > > drivers/media/video/v4l2-dev.c. But adding a delay to e.g. cdev_purge 
> > > would have the same effect.
> > > 
> > > The sequence of events in the case of a webcam is as follows:
> > > 
> > > 1) The USB device is removed, causing a disconnect.
> > > 
> > > 2) The webcam driver unregisters the video device which in turn calls 
> > > cdev_del().
> > > 
> > > 3) When the last application using the device is closed, the cdev is 
> > > released when the kref of the cdev's kobject goes to 0.
> > > 
> > > 4) If the kref's release() call takes a while due to e.g. extra cleanup 
> > > in the case of a webcam, then another application can try to open the 
> > > video device. Note that this requires a device node created with mknod, 
> > > otherwise the device nodes would already have been removed by udev.
> > > 
> > > 5) chrdev_open checks inode->i_cdev. If this is NULL (i.e. this device 
> > > node was never accessed before), then all is fine since kobj_lookup 
> > > will fail because cdev_del() has been called earlier. However, if this 
> > > device node was used earlier, then the else part is called: 
> > > cdev_get(p). This 'p' is the cdev that is being released. Since the 
> > > kref count is 0 you will get a WARN message from kref_get, but the code 
> > > continues on, the f_op->open will (hopefully) return more-or-less 
> > > gracefully with an error and the cdev_put at the end will cause the 
> > > refcount to go to 0 again, which results in a SECOND call to the kref's 
> > > release function!
> > > 
> > > See this link for the original discussion on the v4l list containing 
> > > stack traces an a patch that you need if you want to (and can) test 
> > > this with the uvc driver:
> > > 
> > > http://www.spinics.net/lists/vfl/msg39967.html
> > 
> > The second sentence in that message shows your problem here:
> > 	To avoid the need of a reference count in every v4l2 driver,
> > 	v4l2 moved to cdev which includes its own reference counting
> > 	infrastructure based on kobject.
> > 
> > cdev is not ment to handle the reference counting of any object outside
> > of itself, and should never be embedded within anything.  I've been
> > thinking of taking the real "kobject" out of that structure for a long
> > time now, incase someone did something foolish like this.
> > 
> > Seems I was too late :(
> > 
> > So, to solve this, just remove the reliance on struct cdev in your own
> > structures, you don't want to do this for the very reason you have now
> > found (and for others, like the fact that this isn't a "real" struct
> > kobject in play here, just a fake one.)
> > 
> > Ick, what a mess.
> 
> Sorry, but this makes no sense. First of all the race condition exists
> regardless of how v4l uses it. Other drivers using cdev with a
> hot-pluggable device in combination with a manually created device
> node should show the same problem.

I don't see how this would be, unless this problem has always existed in
the cdev code, it was created back before dynamic device nodes with udev
existed :)

> It's just that we found it with v4l because the release callback takes
> longer than usual, thus increasing the chances of hitting the race.

The release callback for the cdev itself?

> The core problem is simply that it is possible to call cdev_get while
> in cdev_put! That should never happen.

True, and cdev_put is only called from __fput() which has the proper
locking to handle the call to cdev_get() if a char device is opened,
right?

> Secondly, why shouldn't struct cdev be embedded in anything? It's used
> in lots of drivers that way. I really don't see what's unusual or
> messy about v4l in that respect.

I don't see it embedded in any other structures at the moment, and used
to reference count the structure, do you have pointers to where it is?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ