Message-ID: <20090309165010.GB32589@ldl.fc.hp.com>
Date: Mon, 9 Mar 2009 10:50:10 -0600
From: Alex Chiang <achiang@...com>
To: Greg KH <greg@...ah.com>
Cc: kay.sievers@...y.org, rjw@...k.pl, linux-kernel@...r.kernel.org,
linux-pci@...r.kernel.org
Subject: Re: kobj refcounting weirdness
* Greg KH <greg@...ah.com>:
> On Mon, Mar 09, 2009 at 12:36:54AM -0600, Alex Chiang wrote:
> > Hi Kay, Greg,
> >
> > I've been working on this patch series recently that adds
> > function and device level hotplug into the PCI core:
> >
> > http://thread.gmane.org/gmane.linux.kernel.pci/3495
> >
> > For the last two weeks, I've been beating my head against a
> > refcounting/kobject problem, and was hoping you could give me
> > some advice, since I seem to have run into a wall.
> >
> > My test case has been removing device 0000:04:00.0, which should
> > remove all the devices below it.
>
> You are removing the children before the parent device, right? If not,
> you have to be _very_ careful (personally, I don't think you should be
> allowed to do that, but others, like the scsi developers, like doing
> things like this...)

Yes, I'm removing children before the parent, using the
pci_remove_bus_device() interface.

> > In this data set, I turned on kobject debugging, and managed to
> > capture a trace where we die on the 2nd rescan.
> >
> > In this data set, we:
> >
> > - create a kobject for 0000:04:00.0 (e00000018cac2920)
> > - remove the device
> > - observe '0000:04:00.0' (e00000018cac2920): calling ktype release
> > - rescan the bus
> > - discover that e00000018cac2920 is still hanging around!
>
> What do you mean by "rescan"?

By rescan, I mean we're rescanning the entire PCI bus, looking
for new devices.

	for each PCI root bus:
		pci_scan_child_bus()
		pci_bus_add_devices()

> And sure, if you create a new device, it could be allocated at
> the same location, that's what the slab allocators do, right?

I thought about the allocators returning a pointer to the same
location that maybe has some valid looking data hanging around,
but it's not wise for someone like me to go pointing fingers at
the allocator before I've proven the bug isn't in my code. ;)
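
To make the address-reuse point concrete, here is a minimal userspace
sketch. It is not kernel code: toy_obj, toy_alloc(), toy_get(), toy_put()
and the generation field are all hypothetical names invented for this
illustration, and the LIFO free list is only an assumption that loosely
mimics how a slab cache tends to hand back the most recently freed
object. The refcount stands in for a kobject's reference count, and the
final toy_put() plays the role of the ktype release.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a kobject (hypothetical, not the kernel struct). */
struct toy_obj {
	int refcount;               /* live references to this object */
	unsigned long generation;   /* debugging aid: unique per allocation */
	struct toy_obj *next_free;  /* free-list link, meaningful only while free */
};

#define POOL_SIZE 4

static struct toy_obj pool[POOL_SIZE];
static struct toy_obj *free_list;
static unsigned long next_generation;

/* Put every slot on the free list; must be called before toy_alloc(). */
static void pool_init(void)
{
	free_list = NULL;
	for (int i = POOL_SIZE - 1; i >= 0; i--) {
		pool[i].next_free = free_list;
		free_list = &pool[i];
	}
	next_generation = 0;
}

/* LIFO allocation: the most recently freed slot is handed out first. */
static struct toy_obj *toy_alloc(void)
{
	struct toy_obj *obj = free_list;

	if (!obj)
		return NULL;
	free_list = obj->next_free;
	obj->refcount = 1;
	obj->generation = ++next_generation;
	obj->next_free = NULL;
	return obj;
}

/* Take an extra reference on a live object. */
static struct toy_obj *toy_get(struct toy_obj *obj)
{
	assert(obj->refcount > 0);
	obj->refcount++;
	return obj;
}

/* Drop a reference; returns 1 if this was the last one and the slot was freed. */
static int toy_put(struct toy_obj *obj)
{
	assert(obj->refcount > 0);
	if (--obj->refcount > 0)
		return 0;
	obj->next_free = free_list;  /* "release": slot goes back to the pool */
	free_list = obj;
	return 1;
}
```

In this toy, allocating an object, dropping its last reference, and
allocating again returns the same pointer value with a different
generation. Seeing an old address reappear after a release is therefore
not, on its own, evidence of a leaked reference; a per-allocation tag of
this kind is one way to tell a reused slot from a stale object.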

I'm just hoping for some advice on what else I could instrument
to try and track this down further.

Thanks.

/ac