lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <s5hpq61j9qx.wl%tiwai@suse.de>
Date:	Wed, 05 Sep 2012 07:53:42 +0200
From:	Takashi Iwai <tiwai@...e.de>
To:	Ming Lei <ming.lei@...onical.com>
Cc:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>, Kay Sievers <kay@...y.org>,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: A workaround for request_firmware() stuck in module_init

At Wed, 5 Sep 2012 09:15:34 +0800,
Ming Lei wrote:
> 
> On Wed, Sep 5, 2012 at 12:10 AM, Takashi Iwai <tiwai@...e.de> wrote:
> > At Tue, 4 Sep 2012 23:52:15 +0800,
> > Ming Lei wrote:
> >>
> >> On Tue, Sep 4, 2012 at 9:06 PM, Takashi Iwai <tiwai@...e.de> wrote:
> >> > Hi,
> >> >
> >> > as I've got recently a few bug reports regarding the stuck with
> >> > request_firmware() in module_init of some sound drivers, I started
> >> > looking at the issue.  Strangely, the problem doesn't happen on
> >> > openSUSE 12.2 although it has the same udev version with libkmod as
> >> > Fedora.  So I installed Fedora 17, and indeed I could see a problem
> >> > there.
> >>
> >> It should be a bug in udev, as discussed in the below link:
> >>
> >>           http://marc.info/?t=134552745100002&r=1&w=2
> >
> > Yeah, if udev has a fix, I'm fine.  I'll proactively ignore these bug
> > reports.  But did it really happen...?
> 
> Linus has expressed that it should be fixed by udev, maybe we can
> wait some time to see if it will happen, :-)

Kay, what is the status?

> There are more than 300 request_firmware called in probe(), even
> adding 2 line code in these drivers will involve much workload, since
> you need to find the matched probe()  for one request_firmware and
> sometimes it is not easy.

Well, this depends on from which perspective you look at the issue.
See below.

> >> > Obviously a solution would be to rewrite the driver code to use
> >> > request_firmware_nowait() instead.  But it'd need a lot of code
> >> > shuffling, and most of such drivers are old stuff I don't want to do a
> >> > serious surgery.
> >> >
> >> > So I tried an easier workaround by using the deferred probing.
> >> > An experimental patch is below.  As you can see, from the driver side,
> >> > it's simple: just add two lines at the head of each probe function.
> >>
> >> In fact, the defer probe should only be applied in situations which
> >> driver is built in kernel and request_firmware is called in .probe().
> >
> > Is it?  I thought the deferred probe is basically not for this problem
> > but rather for the dependency problem with other modules.  That's the
> > reason it's triggered only upon the successful binding.
> 
> Sorry, could you explain the dependency in a bit detail?

When a device has some implicit dependency on another hardware
component (e.g. SDHCI depends on I2C GPIO controller, as in comment in
drivers/base/dd.c), the driver needs to wait until another one gets
ready.  -EPROBE_DEFER was introduced for such a purpose.

So, using this mechanism for the firmware issue is a kind of abuse.

> >From your patch, I understand you just convert the caller of
> request_firmware from module_init into deferred_probe_work_func(),
> so the udev problem is avoided, isn't it?

Yes, that was a hack.  The idea is just offloading the probe, and the
deferred probe is an existing simple way for that.

> Looks making all .probe() run non-module_init context is still a solution, :-)

Sounds interesting.

> > And IMO, the deferred probe for the built-in kernel is also silly.
> > Did anyone really make it working for built-in kernel driver and
> > external firmware files?  (For the resume, it's of course a different
> 
> Yes, my original patch does work for the built-in situations.
> 
> > issue.  And I guess it's been solved by your fw cache patch, right?)
> >
> >> > Do you think this kind of hack is OK?  If not, any better (IOW easier)
> >> > solution?
> >>
> >> Looks your solution is a bit complicated, and I have wrote a similar
> >> patch to address the problem, but Linus thought that it may hide the
> >> problem of drivers.
> >>
> >>         http://marc.info/?t=134278790800004&r=1&w=2
> >>
> >> IMO, driver core needn't to be changed, and the defer probe can be
> >> supported just by changes in request_firmware() and its caller.
> >
> > Ideally, yes.  But it won't work in some drivers like sound drivers,
> > that have its own device number counting changed at each probe call.
> > For such drivers, the deferred probe check must be done before
> > entering the normal probe procedure, and returning -EPROBE_DEFER would
> > be too late (or need more complex fallbacks).
> 
> Simply, you can put the firmware loading at the start of probe to
> avoid the specific
> sound problem, :-)

Well, I can't buy it.  Changing the call order can be often more
problematic.  Anyway...


IMO, the primary question is whether we still regard the
request_firmware() call in module_init as a driver bug.  Or, looking
at the usage numbers, we should accept it as a de facto standard use
case?

If we still see it as a bug, the only right way is to fix the drivers,
not the core.  That's why I suggested to put some fix to each driver.
In that way, it shows obvious that the driver is a kind of buggy but
fixed in a very lazy way (the function name should have been more
obvious one like i_may_call_request_firmware_so_call_me_later()).
The difference is that this fixes the bug, not hiding the problem.

OTOH, if we see it no longer as a bug, we should rather improve the
handling in the kernel core.  (Or, more radically, we can consider a
sort of async probing as default :) 
In anyway, fixing the core side is justified only if we agree that
request_firmware() in module_init is a valid use case.

So, before starting the work, we actually need a consensus.


thanks,

Takashi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ