lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Thu, 18 Dec 2008 10:25:46 +0200
From:	Boaz Harrosh <bharrosh@...asas.com>
To:	Hans Verkuil <hverkuil@...all.nl>
CC:	Greg KH <greg@...ah.com>, linux-kernel@...r.kernel.org,
	v4l <video4linux-list@...hat.com>,
	Laurent Pinchart <laurent.pinchart@...net.be>,
	Tejun Heo <tj@...nel.org>
Subject: Re: [BUG] cdev_put() race condition

Hans Verkuil wrote:
>>>>> delayin
>>>>> delayout
>>>>> delayout
>> OK Now I do not understand something. How is the double release possible?
>> I guess it is something to do with the complicated chrdev_open() in its
>> inode->i_cdev == NULL case. But still isn't the kref inside kobject
>> suppose to protect me from exactly that?
> 
> Only if there is also proper locking to prevent a kref_get from being called 
> when the release of a kref_put is in progress. And that's missing in cdev.
> 
> Typical scenario: a USB device is disconnected, the driver sees that no 
> applications are using it, then it calls cdev_del. This calls cdev_put, the 
> cdev's refcount goes to 0 and it will call release. BUT, at this moment an 
> application can open the device *again* and chrdev_open() will reuse the 
> i_cdev pointer and call cdev_get(p) on it, and this increases the refcount 
> from 0 to 1 again. And later it calls cdev_put again and release is called 
> a second time.

OK I see. 

> 
>>>>> Note the duplicate 'delayin' messages. Also note that the cdev struct
>>>>> was allocated by sg.c, so the second cdev cleanup will likely poke
>>>>> into already freed memory.
>>>>>

>> I had a similar problem and solved it in a way that I think should be
>> safe. I'll try and run your tests to make sure. Here is a short
>> description of my solution:
>>
>> MyDevice {
>>    ...
>>    embed_cdev;
>>    embed_kref;
>>    ...
>> };
>>
>> OnHotPlug() {
>> 	...
>> 	kref_init(embed_kref); // kref==1
>> 	cdev_add(embed_cdev);  // cdev==1
>> 	get_(embed_cdev);      // cdev++==2
>> }
>>
>> OnFileOpen() { // kernel does // cdev++==n > 2
>> 	kref_get(embed_kref); // kref++==n > 1
>> }
>>
>> OnFileClose() {
>> 	kref_put(embed_kref, __release); // kref++==n >= 0
>> } // kernel does // cdev--==n >= 1
>>
>> OnHotRemove() {
>> 	cdev_del(embed_cdev);  // cdev--==1
>> 	kref_put(embed_kref, __release); // kref--==n >= 0
>> }
>>
>> __release() { // by definition kref==0
>> 	put_(embed_cdev);      // cdev--==0
>> }
>>
>> What do you think?
> 
> This won't help either. __release calls cdev_put, the cdev refcount goes to 
> 0, the cdev release is called, but at this time someone can open the device 
> again. Refcounting in the driver simply won't help since chrdev_open is 
> always called before the driver has a chance to check anything.
> 

No, But at this point cdev_del has already been called before the final put,
so if a chrdev_open is called while in cdev_release it will not find my
device anymore. I have separated the unmapping of the device from it's final put.

> In addition, I think it is nuts to introduce a kref that just shadows the 
> cdev's kref. We should be able to rely on cdev for this.
> 

I agree, that's why I like Tejun's patch because in theory we can get
rid of the shadow kref. (Without a layering violation)

> I expect that when you test your driver with this you should hit the same 
> race condition. Remember that you need to mknod the device. If you rely on 
> udev then this will never happen because udev has removed the device node 
> before cdev_del is called. (At least, I think that is always true. I'm no 
> expert on this.)
> 

OK I see, I do use udev in all my fedora images. I will try to disable it
and test. So far I was unable to reproduce the problem with my device.

>> BTW sg does not use kref and I suspect it might be racy by its own.
> 
> It might, but it has nothing to do with this bug.
> 
> Regards,
> 
> 	Hans
> 

Thanks
Boaz
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ