Message-ID: <20140604032533.GA22469@roeck-us.net>
Date: Tue, 3 Jun 2014 20:25:33 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Greg KH <gregkh@...uxfoundation.org>
Cc: Francesco Ruggeri <fruggeri@...stanetworks.com>,
linux-kernel@...r.kernel.org, hare@...e.de, fruggeri@...sta.com
Subject: Re: pci: kernel crash in bus_find_device
On Tue, Jun 03, 2014 at 04:21:00PM -0700, Greg KH wrote:
> On Tue, Jun 03, 2014 at 03:55:02PM -0700, Francesco Ruggeri wrote:
> > In-Reply-To: <20140523023141.GC13900@...ah.com>
> >
> >
> > Hi Guenter,
> > I got back to looking into this crash.
> > Just as an example, the attached diffs also fix my bus_find_device problem for
> > traversals that start from the head of the list and traverse it completely.
> > They are very specific to the case of bus_find_device, and a complete solution
> > would affect a lot of code.
> > The main issue seems to be that when a device is found in a klist by, say,
> > bus_find_device, the klist_node reference should be returned to the caller,
> > who should then decide whether to use it for the next klist search, drop it or
> > maybe exchange it for a struct device reference. When resuming a search one
> > should already hold a klist_node reference from the previous search.
> > This model is broken by several functions using struct devices such as
> > bus_find_device, which resume klist searches on the implicit assumption that
> > holding a reference to the struct device is enough to acquire one on the
> > klist_node.
> > The only reason that this has not been a big issue so far is probably that
> > on most systems struct devices are not destroyed and created very often.
>
> Not true, this happens on every USB device insertion and removal, and on
> startup and shutdown. What makes PCI special that we aren't hitting
> these issues in USB and other subsystems that do a lot of device
> creation/removal?
>
Look for callers of bus_find_device. Unless I am missing something, only pci
and scsi code call it with a non-NULL 'start' argument, and the scsi use is
limited to a walk through scsi devices for a proc file.
Makes me wonder if the start argument should go away, and if pci and scsi
should use another means to walk through devices.
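
For illustration only, here is a userspace toy model of the resume rule
discussed above. It is not kernel code: struct toy_node, toy_find_next,
toy_del and the other toy_* names are hypothetical stand-ins, and the
refcounting is only a loose approximation of what a klist node does. The
point it tries to show is that a search resumed from 'start' is only safe
if the caller already holds a reference on that node, which is what the
assert checks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a list node; NOT the kernel's struct klist_node. */
struct toy_node {
	int id;
	int refs;               /* 1 for list membership, +1 per holder */
	int dead;               /* marked for removal, skipped by searches */
	struct toy_node *next;
};

struct toy_list {
	struct toy_node *head;
};

/* Unlink a node by scanning from the head (fine for a toy list). */
static void toy_unlink(struct toy_list *l, struct toy_node *n)
{
	struct toy_node **pp = &l->head;

	while (*pp && *pp != n)
		pp = &(*pp)->next;
	if (*pp)
		*pp = n->next;
}

/* Drop one reference; the node leaves the list when the last one goes. */
static void toy_put(struct toy_list *l, struct toy_node *n)
{
	if (--n->refs == 0)
		toy_unlink(l, n);
}

/* Remove a node: mark it dead and drop the membership reference. */
static void toy_del(struct toy_list *l, struct toy_node *n)
{
	n->dead = 1;
	toy_put(l, n);
}

/*
 * Resume a search after 'start', or begin at the head if start is NULL.
 * The caller must already hold a reference on 'start'; it is consumed.
 * Returns the next live node with a reference held, or NULL at the end.
 */
static struct toy_node *toy_find_next(struct toy_list *l,
				      struct toy_node *start)
{
	struct toy_node *n;

	if (start)
		assert(start->refs > 0);  /* resuming requires a held node */

	n = start ? start->next : l->head;
	while (n && n->dead)
		n = n->next;
	if (n)
		n->refs++;
	if (start)
		toy_put(l, start);
	return n;
}

/* Walk three nodes, deleting the first one mid-walk; returns nodes visited. */
int toy_demo(void)
{
	struct toy_node n3 = { 3, 1, 0, NULL };
	struct toy_node n2 = { 2, 1, 0, &n3 };
	struct toy_node n1 = { 1, 1, 0, &n2 };
	struct toy_list l = { &n1 };
	struct toy_node *cur = NULL;
	int visited = 0;

	while ((cur = toy_find_next(&l, cur)) != NULL) {
		visited++;
		if (cur->id == 1)
			toy_del(&l, cur);  /* removed while we still hold it */
	}
	return visited;
}
```

Because the walker keeps its node reference across the removal, the deleted
node stays linked until the walk has moved past it, and all three nodes are
visited. Resuming from a node whose count had already dropped to zero would
trip the assert, which is the toy analogue of the crash in this thread.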
Guenter