Date:	Mon, 9 Mar 2009 10:50:10 -0600
From:	Alex Chiang <achiang@...com>
To:	Greg KH <greg@...ah.com>
Cc:	kay.sievers@...y.org, rjw@...k.pl, linux-kernel@...r.kernel.org,
	linux-pci@...r.kernel.org
Subject: Re: kobj refcounting weirdness

* Greg KH <greg@...ah.com>:
> On Mon, Mar 09, 2009 at 12:36:54AM -0600, Alex Chiang wrote:
> > Hi Kay, Greg,
> > 
> > I've been working on this patch series recently that adds
> > function and device level hotplug into the PCI core:
> > 
> > 	http://thread.gmane.org/gmane.linux.kernel.pci/3495
> > 
> > For the last two weeks, I've been beating my head against a
> > refcounting/kobject problem, and was hoping you could give me
> > some advice, since I seem to have run into a wall.
> > 
> > My test case has been removing device 0000:04:00.0, which should
> > remove all the devices below it.
> 
> You are removing the children before the parent device, right?  If not,
> you have to be _very_ careful (personally, I don't think you should be
> allowed to do that, but others, like the scsi developers, like doing
> things like this...)

Yes, I'm removing children before the parent, using the
pci_remove_bus_device() interface.
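
To make the ordering concrete, here is a minimal userspace sketch of "children before parent" removal. It is not the kernel's pci_remove_bus_device() implementation; struct node, MAX_CHILDREN and remove_subtree() are hypothetical names made up for illustration, and the "removal" only records the order in which nodes would be torn down.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical device-tree node; NOT a kernel structure. */
struct node {
	const char *name;
	struct node *children[MAX_CHILDREN];
	size_t nr_children;
};

/*
 * Post-order walk: every descendant is recorded in 'order' before the
 * node itself, so a parent is never torn down while it still has
 * children. 'order' must have room for every node in the subtree.
 * Returns the updated number of entries in 'order'.
 */
static size_t remove_subtree(struct node *n, const char **order, size_t count)
{
	size_t i;

	assert(n != NULL);
	for (i = 0; i < n->nr_children; i++)
		count = remove_subtree(n->children[i], order, count);
	n->nr_children = 0;		/* all children are gone now */
	order[count++] = n->name;	/* then the node itself */
	return count;
}
```

Calling remove_subtree() on the top-level node fills 'order' with the leaves first and the top-level node last, which is the ordering described above.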

> > In this data set, I turned on kobject debugging, and managed to
> > capture a trace where we die on the 2nd rescan.
> > 
> > In this data set, we:
> > 
> > 	- create a kobject for 0000:04:00.0 (e00000018cac2920)
> > 	- remove the device
> > 	- observe '0000:04:00.0' (e00000018cac2920): calling ktype release
> > 	- rescan the bus
> > 	- discover that e00000018cac2920 is still hanging around!
> 
> What do you mean by "rescan"?  

By rescan, I mean we're rescanning the entire PCI bus, looking
for new devices.

	for each PCI root bus:
		pci_scan_child_bus()
		pci_bus_add_devices()

> And sure, if you create a new device, it could be allocated at
> the same location, that's what the slab allocators do, right?

I thought about the allocator returning a pointer to the same
location, which might still have some valid-looking data hanging
around, but it's not wise for someone like me to go pointing fingers
at the allocator before I've proven the bug isn't in my code. ;)
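
One way to tell "stale object" apart from "new object at a recycled address" is to stamp each allocation with a generation number and log it next to the pointer. The following is a minimal userspace sketch of that idea, assuming a made-up struct fake_dev and helpers fake_dev_alloc()/fake_dev_put(); none of these are kernel APIs, and malloc/free only stand in for the slab allocator here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted device; NOT struct kobject. */
struct fake_dev {
	unsigned long generation;	/* unique per allocation */
	int refcount;
};

static unsigned long next_generation = 1;

/* Allocate a device holding one reference and a fresh generation. */
static struct fake_dev *fake_dev_alloc(void)
{
	struct fake_dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->generation = next_generation++;
	d->refcount = 1;
	return d;
}

/* Drop one reference; free on the last one. Returns 1 if freed. */
static int fake_dev_put(struct fake_dev *d)
{
	assert(d->refcount > 0);	/* catches an extra put on a live object */
	if (--d->refcount > 0)
		return 0;
	free(d);
	return 1;
}
```

If a device is released and a later allocation lands at the same address, the two will still carry different generation values, so a debug trace that prints both the pointer and the generation can distinguish address reuse from an object that was never really freed.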

I'm just hoping for some advice on what else I could instrument
to try and track this down further.

Thanks.

/ac

