Date:	Thu, 22 May 2014 10:57:00 -0700
From:	Guenter Roeck <linux@...ck-us.net>
To:	Francesco Ruggeri <fruggeri@...sta.com>
Cc:	Greg Kroah-Hartmann <gregkh@...uxfoundation.org>,
	Hannes Reinecke <hare@...e.de>, linux-kernel@...r.kernel.org
Subject: Re: pci: kernel crash in bus_find_device

On Thu, May 22, 2014 at 09:19:40AM -0700, Francesco Ruggeri wrote:
> Aborting a search does not sound like a correct solution.
> How does a higher level user (eg for_each_pci_dev) know that a search
> was aborted and decide whether it should try again, assuming it would
> be ok repeating the action on the devices visited the first time?
> 
Agreed, it is less than desirable.

I would consider this to be a secondary problem, though, the immediate
problem being the crash. One possible solution might be to have the various
functions return error codes (ERR_PTR), but that would be quite invasive as
well. I really think we need input from Greg and, if the solution touches
the PCI subsystem, from Bjorn Helgaas to find an acceptable solution
to that problem.
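
To illustrate the ERR_PTR idea mentioned above, here is a minimal userspace
sketch, not a proposed patch. The kernel's ERR_PTR(), PTR_ERR() and IS_ERR()
helpers are mimicked by the lower-case stand-ins below; struct toy_dev,
toy_find() and toy_describe() are hypothetical names used only for this
example, and the choice of -EAGAIN for an aborted walk is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(). */
static inline void *err_ptr(long err)
{
	/* Only small negative errno values can be encoded in a pointer. */
	assert(err < 0 && err >= -MAX_ERRNO);
	return (void *)(intptr_t)err;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int is_err(const void *p)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical device entry; "dead" marks an entry that is being removed. */
struct toy_dev {
	int id;
	int dead;
};

/*
 * Hypothetical search helper with three distinct outcomes:
 *   valid pointer     - match found
 *   NULL              - list fully walked, no match
 *   err_ptr(-EAGAIN)  - walk aborted at a dead entry
 */
static struct toy_dev *toy_find(struct toy_dev *devs, size_t n, int id)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].dead)
			return err_ptr(-EAGAIN);
		if (devs[i].id == id)
			return &devs[i];
	}
	return NULL;
}

/* Caller-side view: "not found" and "aborted" are now distinguishable. */
static const char *toy_describe(struct toy_dev *r)
{
	if (is_err(r))
		return ptr_err(r) == -EAGAIN ? "aborted, may retry" : "error";
	return r ? "found" : "not found";
}
```

With something along these lines, a higher-level user could tell an aborted
search apart from an exhausted one and decide whether to retry. Every existing
caller that only checks for NULL would have to be audited, though, which is
why this would be invasive.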

Guenter

> Francesco
> 
> 
> On Thu, May 22, 2014 at 12:22 AM, Guenter Roeck <linux@...ck-us.net> wrote:
> > On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote:
> >>
> >> On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote:
> >>>
> >>> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote:
> >>>>
> >>>> I have been using an x86 platform.
> >>>> When I started working on it I got early crashes until I added the
> >>>> check for p not NULL in
> >>>>
> >>>> +void bus_release_device(struct device *dev)
> >>>> +{
> >>>> +	struct device_private *p = dev->p;
> >>>> +
> >>>> +	if (p && klist_node_attached(&p->knode_bus))
> >>>> +		klist_put_last(&p->knode_bus);
> >>>> +}
> >>>> +
> >>>>
> >>>> Maybe on powerpc *p is overwritten between device_del and device_release?
> >>>>
> >>>> Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are
> >>>> treated as WARN_ONs in the current klist code.
> >>>> The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I
> >>>> ran into it without the second patch (but only when I ran my module
> >>>> and tests).
> >>>>
> >>> Hi Francesco,
> >>>
> >>> I replaced the BUG_ON with WARN_ON; still crashes.
> >>>
> >>> Anyway, the problem seems to be known. I found two related exchanges.
> >>>
> >>> [1] describes pretty much the same problem. I don't see if/where it was
> >>> ever fixed, though.
> >>>
> >>> [2] is a patch to fix the problem. It did not apply cleanly to 3.14,
> >>> so I had to make some adjustments in klist_iter_init_node. Resulting
> >>> patch is below. With this patch, the problem is gone. It is not perfect,
> >>> as it aborts the loop if it encounters a deleted kobject, but it is
> >>> better than nothing. Unfortunately, the patch never made it upstream;
> >>> no idea why.
> >>> Copying the author and Greg to get additional feedback.
> >>>
> >>> Guenter
> >>>
> >>> [1] https://lkml.org/lkml/2008/10/26/79
> >>> [2] https://lkml.org/lkml/2012/4/16/218
> >>
> >>
> >> 2 years ago?  I have no idea what was up with that, sorry...
> >>
> >
> > Ok, but do you have comments on the patch itself in its current version?
> >
> > Guenter
> >
> 
