lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 08 Mar 2008 08:03:23 -0600
From:	Frank Sorenson <frank@...rocks.com>
To:	Matt Domsch <Matt_Domsch@...l.com>
CC:	Ingo Molnar <mingo@...e.hu>, kay.sievers@...y.org,
	LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
	"Rafael J. Wysocki" <rjw@...k.pl>, jcm@...hat.com
Subject: Re: 2.6.25-rc4 OOMs itself dead on bootup (modprobe bug?)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Matt Domsch wrote:
> On Sat, Mar 08, 2008 at 01:08:46AM -0600, Frank Sorenson wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Frank Sorenson wrote:
>>> I did some additional debugging, and I believe you're correct about it
>>> being specific to my system.  The system seems to run fine until some
>>> time during the boot.  I booted with "init=/bin/sh" (that's how the
>>> system stayed up for 9 minutes), then it died when I tried starting
>>> things up.  I've further narrowed the OOM down to udev (though it's not
>>> entirely udev's fault, since 2.6.24 runs fine).
>>>
>>> I ran your debug info tool before killing the box by running
>>> /sbin/start_udev.  The output of the tool is at
>>> http://tuxrocks.com/tmp/cfs-debug-info-2008.03.06-14.11.24
>>>
>>> Something is apparently happening between 2.6.24 and 2.6.25-rc[34] which
>>> causes udev (or something it calls) to behave very badly.
>> Found it.  The culprit is 8f47f0b688bba7642dac4e979896e4692177670b
>>     dcdbas: add DMI-based module autloading
>>
>>     DMI autoload dcdbas on all Dell systems.
>>
>>     This looks for BIOS Vendor or System Vendor == Dell, so this should
>>     work for systems both Dell-branded and those Dell builds but brands
>>     for others.  It causes udev to load the dcdbas module at startup,
>>     which is used by tools called by HAL for wireless control and
>>     backlight control, among other uses.
>>
>> What actually happens is that when udev loads the dcdbas module at
>> startup, modprobe apparently calls "modprobe dcdbas" itself, repeating
>> until the system runs out of resources (in this case, it OOMs).
>>
>> # ps axf
>> ...
>>   506 ?        S      0:00 /bin/bash /sbin/start_udev
>>   590 ?        S      0:00  \_ /sbin/udevsettle
>>   533 ?        S<s    0:00 /sbin/udevd -d
>>   629 ?        S<     0:00  \_ /sbin/udevd -d
>>   630 ?        S<     0:00  |   \_ /sbin/modprobe
>> dmi:bvnDellInc.:bvrA08:bd04/02/2007:svnDellInc.:pnMP061:pvr:rvnDellInc.:rn0YD479:rvr:cvnDellInc.:ct8:cvr:
>>   949 ?        S<     0:00  |       \_ /sbin/modprobe dcdbas
>>   950 ?        S<     0:00  |           \_ /sbin/modprobe dcdbas
>>   951 ?        S<     0:00  |               \_ /sbin/modprobe dcdbas
>>   953 ?        S<     0:00  |                   \_ /sbin/modprobe dcdbas
>>   955 ?        S<     0:00  |                       \_ /sbin/modprobe dcdbas
>>   958 ?        S<     0:00  |                           \_
>> /sbin/modprobe dcdbas
>> ...repeat...
>>
>> When the system crashed, there were at least 11,600 instances of
>> "/sbin/modprobe dcdbas", each calling the next.
>>
>> Reverting 8f47f0b lets the system boot up just fine again.  Note that a
>> manual "modprobe dcdbas" also causes this recursive behavior, it's just
>> not forced on the system by udev.
>>
>> So dcdbas is a regression from 2.6.24, as well as being broken in other
>> ways.
>>
>> Frank
>> - --
>> Frank Sorenson - KD7TZK
>> Linux Systems Engineer, DSS Engineering, UBS AG
>> frank@...rocks.com
> 
> 
> Frank, what version of module-init-tools do you have?  This has been
> in use in Fedora 8 for a few months, and this is the first failure
> report I've seen.
> 
> I'm fine with reverting the patch for now, but really do want to get
> to root cause, because module autoloading is a really good idea, and
> it would be a shame if we couldn't keep that feature enabled because
> some systems have incompatible module-init-tools, and the kernel can't
> know that...  (Perhaps udev could know and not invoke modprobe in
> those instances?)
> 
> -Matt

It's module-init-tools-3.4-2.fc8.x86_64 (most recent Fedora rpm available).

Frank
- --
Frank Sorenson - KD7TZK
Linux Systems Engineer, DSS Engineering, UBS AG
frank@...rocks.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFH0pyoaI0dwg4A47wRAq9rAKCVbg5ngSyHVORpLAcD4WY4vNMQlQCdGtr1
9CiHmom5Vopsqukc8e+D1RU=
=GPMU
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ