Message-ID: <AANLkTimdV1WtdPXeZ8JO40gkC=2dt27bqKxGORuHVyrn@mail.gmail.com>
Date: Thu, 6 Jan 2011 17:41:39 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Chris Ball <cjb@...top.org>
Cc: Nick Piggin <npiggin@...il.com>,
Jongman Heo <jongman.heo@...il.com>,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [announce] vfs-scale git tree update
On Thu, Jan 6, 2011 at 4:59 PM, Chris Ball <cjb@...top.org> wrote:
>
> In my case, the hang happens when microcode.ko is modprobed and calls
> out for device firmware via request_firmware(), and then udev also calls
> microcode_ctl, which attempts to open(2) /dev/cpu/microcode to write
> microcode into it. (The request_firmware() interface is the preferred
> one, and opening /dev/cpu/microcode is an older compatibility interface.)
Hmm. That modprobe seems to be hung on 'sysdev_drivers_lock'.
Which in turn seems to be _held_ by the first modprobe, which is
waiting for a request_firmware:
[ 256.980052] modprobe D 00000000ffff4f88 0 372 1 0x00000000
[ 256.981227] ffff88022206dc58 0000000000000086 0000000000000292 00000000ffffffff
[ 256.982415] 0000000000013840 0000000000013840 0000000000013840 ffff88022620dc40
[ 256.983692] 0000000000013840 ffff88022206dfd8 0000000000013840 0000000000013840
[ 256.984979] Call Trace:
[ 256.986306] [<ffffffff81463a41>] schedule_timeout+0x36/0xe3
[ 256.987615] [<ffffffff8110ad4c>] ? kfree+0xc9/0xd6
[ 256.988893] [<ffffffff8103d243>] ? need_resched+0x23/0x2d
[ 256.990337] [<ffffffff81463824>] wait_for_common+0xad/0x102
[ 256.991637] [<ffffffff8104757f>] ? default_wake_function+0x0/0x14
[ 256.992954] [<ffffffff81463931>] wait_for_completion+0x1d/0x1f
[ 256.994360] [<ffffffff812f42df>] _request_firmware+0x2df/0x39a
[ 256.999744] [<ffffffffa00f6358>] microcode_init_cpu+0xc4/0x115 [microcode]
[ 257.001112] [<ffffffffa00f6409>] mc_sysdev_add+0x60/0x76 [microcode]
[ 257.002458] [<ffffffff812e9772>] sysdev_driver_register+0xc0/0x11b
and everybody else is in the open path for the microcode. And that
request_firmware holds the lock, because it's done through the ->add()
function of another sysdev_driver_register().
I'm wondering if this is a previously existing race condition leading
to a deadlock. One that previously would have been serialized enough
by the dcache lock that you'd never have that happen.
It might be interesting to re-run it with mutex debugging and lockdep
enabled, to see if that reports anything. Although it probably won't,
because it's not about a plain lock dependency, but ends up being
deadlocked on the uevent being finished (but you have the modprobe and
the request_firmware ones waiting on each other).
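The shape of that hang can be sketched in userspace. This is only an analogy, not kernel code: it assumes (as the trace suggests but does not prove) that whatever would complete the firmware request ends up needing the same lock the waiter already holds. The names simulate, registrar, helper, drivers_lock and firmware_done are hypothetical, and timeouts stand in for what would be an indefinite hang in the kernel.

```python
import threading

def simulate(release_lock_before_waiting, timeout=0.5):
    """Return True if the 'firmware request' completes within the timeout.

    The calling thread plays the registrar (the first modprobe); the helper
    thread plays whoever must take the same lock before it can signal
    completion. All names here are illustrative assumptions.
    """
    drivers_lock = threading.Lock()      # stands in for sysdev_drivers_lock
    firmware_done = threading.Event()    # stands in for the completion
    helper_may_start = threading.Event()

    def helper():
        helper_may_start.wait()
        # The helper cannot signal completion without the lock.
        if drivers_lock.acquire(timeout=timeout):
            try:
                firmware_done.set()
            finally:
                drivers_lock.release()

    t = threading.Thread(target=helper)
    t.start()

    drivers_lock.acquire()
    if release_lock_before_waiting:
        # Drop the lock first, then wait: the helper can make progress.
        drivers_lock.release()
        helper_may_start.set()
        completed = firmware_done.wait(timeout)
    else:
        # Wait while still holding the lock: the helper is stuck behind us,
        # so the wait can only end by timing out.
        try:
            helper_may_start.set()
            completed = firmware_done.wait(timeout)
        finally:
            drivers_lock.release()
    t.join()
    return completed
```

Calling simulate(release_lock_before_waiting=False) times out, while simulate(release_lock_before_waiting=True) completes. Note that no second lock is involved, which is why a checker that only tracks lock-ordering dependencies would not necessarily flag this pattern.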
I dunno. I haven't really thought that fully through. But we've had
cases roughly like that before, and yes, they can be exposed by some
independent serialization going away - long-standing potential bugs,
that simply never happened in practice before.
Linus