linux-kernel - Re: [RFC/PATCH] Multithread initcalls to auto-resolve ordering issues.

Date:	Wed, 11 Jan 2012 23:45:46 -0700
From:	Grant Likely <grant.likely@...retlab.ca>
To:	Mark Brown <broonie@...nsource.wolfsonmicro.com>
Cc:	NeilBrown <neilb@...e.de>, MyungJoo Ham <myungjoo.ham@...sung.com>,
	Randy Dunlap <rdunlap@...otime.net>,
	Mike Lockwood <lockwood@...roid.com>,
	Arve Hjønnevåg <arve@...roid.com>,
	Kyungmin Park <kyungmin.park@...sung.com>,
	Donggeun Kim <dg77.kim@...sung.com>, Greg KH <gregkh@...e.de>,
	Arnd Bergmann <arnd@...db.de>,
	MyungJoo Ham <myungjoo.ham@...il.com>,
	Linus Walleij <linus.walleij@...aro.org>,
	Dmitry Torokhov <dmitry.torokhov@...il.com>,
	Morten CHRISTIANSEN <morten.christiansen@...ricsson.com>,
	Liam Girdwood <lrg@...com>, linux-kernel@...r.kernel.org
Subject: Re: [RFC/PATCH] Multithread initcalls to auto-resolve ordering issues.

On Mon, Jan 9, 2012 at 1:08 AM, Mark Brown
<broonie@...nsource.wolfsonmicro.com> wrote:
> On Mon, Jan 09, 2012 at 06:28:00PM +1100, NeilBrown wrote:
>> On Sun, 8 Jan 2012 22:22:31 -0800 Mark Brown
>> > On Mon, Jan 09, 2012 at 04:10:58PM +1100, NeilBrown wrote:
>
>> > So, my general inclination is that given the choice between parallel and
>> > serial solutions I'll prefer the serial solution on the basis that it's
>> > most likely going to be easier to think about and less prone to getting
>> > messed up.
>
>> Surely anyone doing kernel work needs to be able to understand parallel
>> solutions at least enough to place locks in appropriate places ???
>
> You'd expect people to be able to work it out but there's no sense in
> doing something hard if something easy works just as well - concurrency
> can bring problems with things like reproducibility which make life
> harder than it might otherwise be.

+1

>> I thought about doing a serial retry solution, but the error from the ->probe
>> function doesn't percolate all the way up to the initcall.
>> In particular, when a driver is registered driver_attach is called for each
>> unattached device on the bus.  This is done in __driver_attach which discards
>> the error return from driver_probe_device().
>
> There's code for doing the retries floating around, Grant Likely was
> working on it initially then someone from Linaro picked it up and I'm
> not sure what happened.

I'm going to pick up the patch again next week and get it ready for
possible 3.4 merging.  I've gotten a lot of requests to get this work
finished.  The latest posted version of the patch can be found here [1]
and the LWN article is here [2]:

[1] https://lkml.org/lkml/2011/10/7/17
[2] http://lwn.net/Articles/450460/

I (obviously) prefer the deferred probe approach over using threaded
initcalls.  I originally did look at doing exactly what is proposed
here, but I didn't like that it required each subsystem to be
explicitly modified to provide blocking request calls, and I also
discovered that it's been tried and failed several times before.  It
appears that there are a lot of undeclared dependencies and
concurrency issues between device drivers that are pretty much
impossible to track down.

I like the deferred probe approach because it is conceptually simple,
it is minimally invasive, and it works for all subsystems without
needing to implement blocking infrastructure.

As far as the device tree aspects go, it is true that whether or not a
resource will exist is described by device tree data.  However, the
best place to interpret that data is not in the resource provider's
driver, but in the resource consumer's, because the consumer's driver
understands how resources are bound for that specific device.  (A
provider node generally has no information about, or way to determine,
which consumer nodes will be using it.)

[...]
>> Is single-threading really worth all the churn deep inside the drivers/base
>> code that it would probably require?
>
> I don't see why it'd require much churn to be honest - the patches that
> I looked at weren't that invasive, basically just shove devices that
> fail with a particular code into a retry list and iterate through it
> whenever it seems useful to do so.

There is very little churn.  The driver model already did /almost/
everything that was needed.  I pretty much just needed to add the list
and an iterator for it.
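
To make the "list and an iterator" idea concrete, here is a rough
userspace sketch of a deferred-probe retry loop.  It is not the posted
patch and does not use the driver core's real structures: every name in
it (demo_dev, demo_probe, demo_probe_all, DEMO_PROBE_DEFER,
DEMO_MAX_DEVS) is made up for illustration, and a single "needs" pointer
stands in for whatever resource a consumer would look up.  It assumes
the consumer's probe is the one that decides whether its provider is
ready, as argued above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the "particular code" a probe returns when a
 * resource it depends on is not available yet. */
#define DEMO_PROBE_DEFER (-1)
#define DEMO_MAX_DEVS    16

/* Illustrative device: at most one dependency, for brevity. */
struct demo_dev {
	const char *name;
	const struct demo_dev *needs;	/* provider consumed, or NULL */
	bool bound;
};

/* The consumer checks its own dependency and asks to be retried later
 * if the provider has not been bound yet. */
static int demo_probe(struct demo_dev *dev)
{
	if (dev->needs && !dev->needs->bound)
		return DEMO_PROBE_DEFER;
	dev->bound = true;
	return 0;
}

/* Probe each device once in registration order, park deferrals on a
 * retry list, then re-walk that list for as long as a pass binds at
 * least one device.  Returns the number of devices left unbound. */
static size_t demo_probe_all(struct demo_dev **devs, size_t n)
{
	struct demo_dev *deferred[DEMO_MAX_DEVS];
	size_t pending = 0;
	bool progress = true;

	assert(n <= DEMO_MAX_DEVS);

	for (size_t i = 0; i < n; i++)
		if (demo_probe(devs[i]) == DEMO_PROBE_DEFER)
			deferred[pending++] = devs[i];

	while (pending > 0 && progress) {
		size_t still = 0;

		progress = false;
		for (size_t i = 0; i < pending; i++) {
			if (demo_probe(deferred[i]) == DEMO_PROBE_DEFER)
				deferred[still++] = deferred[i];	/* keep waiting */
			else
				progress = true;
		}
		pending = still;
	}
	return pending;
}
```

Registering three devices in the "wrong" order (consumer first, provider
last) still ends with all of them bound after a couple of passes over
the retry list, with no threads or blocking calls involved.  A device
whose provider never shows up simply stays on the list.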

g.