Message-ID: <20081104224510.GA7672@mit.edu>
Date:	Tue, 4 Nov 2008 17:45:10 -0500
From:	Theodore Tso <tytso@....edu>
To:	linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>
Subject: 2.6.28-rc2: REGRESSION in early boot

I've opened a Kernel Bug to track this regression:

     http://bugzilla.kernel.org/show_bug.cgi?id=11951

My fileserver boots under 2.6.27, but it is failing to boot on
2.6.28-rc2.  It took me a while to bisect, so after I finished the
bisection, I retested with the latest mainline
(v2.6.28-rc3-54-g75fa677), and the problem still shows up.

Essentially, the system panics in early boot, resulting in multiple
oopses.  I was finally able to get the very first oops, and the image of
that oops can be found here:

http://thunk.org/tytso/2.6.27-regress/92b29b8/IMG_0331.JPG

From the console snapshot, it looks like two CPUs simultaneously
oopsed with:

BUG: unable to handle kernel NULL dereference at 00000000
BUG: unable to handle kernel NULL dereference at 00000038

On the stack is "scheduler_tick+0x83/0x15f"
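
For readers unfamiliar with the notation, a stack entry of the form
symbol+0xOFFSET/0xSIZE gives the byte offset into the function and the
function's total size, both in hex.  The following is a minimal sketch of
decoding such an entry; the Frame type and parse_frame helper are
hypothetical names used only for illustration and are not part of any
kernel tooling.

```python
import re
from typing import NamedTuple


class Frame(NamedTuple):
    symbol: str
    offset: int  # bytes from the start of the function
    size: int    # total size of the function in bytes


# Matches entries such as "scheduler_tick+0x83/0x15f".
_FRAME_RE = re.compile(
    r"^(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)"
    r"/0x(?P<size>[0-9a-fA-F]+)$"
)


def parse_frame(text: str) -> Frame:
    """Split a 'symbol+0xoff/0xsize' stack entry into its three parts."""
    m = _FRAME_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a symbol+offset/size entry: {text!r}")
    return Frame(m["symbol"], int(m["offset"], 16), int(m["size"], 16))


frame = parse_frame("scheduler_tick+0x83/0x15f")
print(f"{frame.symbol}: byte {frame.offset} of {frame.size}")
# prints "scheduler_tick: byte 131 of 351"
```

So the entry above points at an instruction 131 bytes into
scheduler_tick, which is 351 bytes long in this build.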

When doing a bisection, the last good commit (i.e., the last one which
I can boot on my system) is git id: d6c88a50 (which precedes 2.6.28-rc1).

The first bad git ID is:

commit d6c88a507ef0b6afdb013cba4e7804ba7324d99a
Author: Thomas Gleixner <tglx@...utronix.de>
Date:   Wed Oct 15 15:27:23 2008 +0200

    genirq: revert dynarray
    
    Revert the dynarray changes. They need more thought and polishing.
    
    Signed-off-by: Thomas Gleixner <tglx@...utronix.de>

... but in fact, the failure is different from the above messages.
The failure is also in early boot, but the oops message is quite
different:

http://thunk.org/tytso/2.6.27-regress/b9d7ccf/IMG_0322.JPG

The failure was in kmem_cache_alloc+0xab/0xc4, called by
__create_workqueue_key+0x21/0x145.

Walking forwards, the first git ID which shows the same failure as
what shows up in 2.6.28-rc2 and 2.6.28-rc3 is git ID: 92b29b8, which
apparently is a merge of the tracing-v28-for-linus branch.  Because
there were two failures back-to-back, it's possible either I or git
bisect got confused, since a bisect normally doesn't
terminate on a merge commit.  I'll try double-checking the two
ancestors of the merge commit by hand, but in the meantime I thought
I'd send what I have in case it rings a bell.
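
To illustrate how two back-to-back failures can steer a bisection, here
is a rough sketch.  It assumes a simplified, purely linear history (real
git history is a graph, and git bisect handles that differently), and the
state labels "boots", "oops_A" and "oops_B" are hypothetical stand-ins,
not actual commits.  Marking every non-booting kernel as bad converges on
the first failure of any kind, while testing for one specific oops
converges on a later commit.

```python
from typing import Callable, Sequence


def first_bad(history: Sequence[str], is_bad: Callable[[str], bool]) -> int:
    """Return the index of the first entry for which is_bad is true.

    Assumes history[0] is good, history[-1] is bad, and that once an
    entry is bad every later entry is bad too (a single transition).
    """
    lo, hi = 0, len(history) - 1  # invariant: history[lo] good, history[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(history[mid]):
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical linear history: one regression followed by a second one.
history = ["boots", "boots", "boots", "oops_A", "oops_A",
           "oops_B", "oops_B", "oops_B"]

# "bad" means any boot failure at all.
any_failure = first_bad(history, lambda state: state != "boots")

# "bad" means only the specific failure seen in the newest kernel.
only_b = first_bad(history, lambda state: state == "oops_B")

print(any_failure, only_b)  # prints "3 5"
```

Under these assumptions the two criteria give different answers, which
is one way a bisection for the newest failure could end up pointing at an
earlier, unrelated breakage.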

Again, my system is totally failing to boot since 2.6.28-rc1.  This is
an Aberdeen (white box) fileserver, with a SuperMicro X6DH8-XG2
motherboard, with two Pentium 4 Xeon 3.0GHz with hyperthreading.

I've enclosed a successful dmesg for a 2.6.27 kernel, my .config file,
my /proc/cpuinfo, my /proc/timer_list, and my bisection log.

						- Ted

View attachment "dmesg-2.6.27" of type "text/plain" (49143 bytes)

View attachment "config-2.6.27-04409-ga1aca5d" of type "text/plain" (61090 bytes)

View attachment "cpuinfo" of type "text/plain" (2436 bytes)

View attachment "timer_list" of type "text/plain" (6467 bytes)

View attachment "bisect.log" of type "text/plain" (2627 bytes)
