lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.44L0.0711061419090.8694-100000@iolanthe.rowland.org>
Date:	Tue, 6 Nov 2007 14:49:37 -0500 (EST)
From:	Alan Stern <stern@...land.harvard.edu>
To:	Greg KH <greg@...ah.com>
cc:	Kay Sievers <kay.sievers@...y.org>,
	Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: BUG in: Driver core: convert block from raw kobjects to core
 devices (fwd)

On Mon, 5 Nov 2007, Greg KH wrote:

> On Mon, Nov 05, 2007 at 04:49:21PM -0500, Alan Stern wrote:
> > Greg:
> > 
> > So what's our status?  Do you think it's worthwhile adding the 
> > "drop reference to parent kobject at remove time instead of release 
> > time" patch?
> 
> No.
> 
> I still need to take the time and read this thread and find the real
> problem here.  The fact that the issue does not show up for other,
> non-scsi block devices, makes me feel this is a scsi-specific problem
> with how it deals with the driver model, but I need to take the time to
> sit down and figure it out for sure.

Here's the story as far as the SCSI stack goes.  To what extent other 
subsystems have analogous problems, I don't know.

     1. In drivers/scsi/scsi_scan.c, scsi_alloc_sdev() creates a
	scsi_device structure and calls scsi_alloc_queue(), which
	ends up calling blk_init_queue().  As the creator of the
	request_queue, the SCSI core owns the initial reference
	to q->kobj.

     2. This reference is released as part of the scsi_device's
	release routine.  In scsi_sysfs.c,
	scsi_device_dev_release_usercontext() calls scsi_free_queue(),
	which does nothing but call blk_cleanup_queue(), which
	calls blk_put_queue(), which does the final kobject_put()
	on q->kobj.

As a result of 1 and 2, the request_queue isn't released until the
scsi_device is released.

     3. In sd.c, sd_probe() does "gd = alloc_disk()" and it sets
	gd->driverfs_dev to point to the scsi_device's embedded
	struct device (named sdev_gendev).  It then calls add_disk()
	in block/genhd.c, which calls register_disk() in
	fs/partitions/check.c.  register_disk() sets disk->dev.parent
	to disk->driverfs_dev and then calls device_add(&disk->dev).

Setting disk->dev.parent and calling device_add() in this way is new to
Kay's reworking of the driver core.  Previously disk->dev.kobj had been
registered directly, as the gendisk was some sort of class device
rather than a regular device.

Anyway, the upshot of 3 is that sdev->sdev_gendev.kobj is the parent of
disk->dev.kobj, and consequently the scsi_device can't be released
until the gendisk is released.

     4. add_disk() goes on to call blk_register_queue(disk), which sets
	q->kobj.parent to disk->dev.kobj and then calls
	kobject_add(&q->kobj).

As a result of 4, the gendisk can't be released until the request_queue 
is released.

Thus we have a cycle:

	1&2: request_queue isn't released before scsi_device;

	3: scsi_device isn't released before gendisk;

	4: gendisk isn't released before request_queue.

The dependency in 1&2 is hard-coded into the SCSI core.  If I 
understand correctly, the core really does need the request_queue to 
hang around as long as the scsi_device is still present.  According to 
James Bottomley, any block device driver should be expected to have a 
similar requirement.

But the dependencies in 3 and 4 are unnecessary.  They are artifacts,
caused by the fact that a kobject doesn't drop its reference to its
parent until it is released.  If instead the reference to the parent
were dropped when the kobject was removed then 3 and 4 wouldn't apply.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ