Message-ID: <20080811221616.4f6baaff@neptune.home>
Date:	Mon, 11 Aug 2008 22:16:16 +0200
From:	Bruno Prémont <bonbons@...ux-vserver.org>
To:	Suresh Siddha <suresh.b.siddha@...el.com>
Cc:	"H. Peter Anvin" <hpa@...or.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Al Viro <viro@...IV.linux.org.uk>,
	"Rafael J. Wysocki" <rjw@...k.pl>, linux-kernel@...r.kernel.org,
	linux-scsi@...r.kernel.org, linux-ide@...r.kernel.org
Subject: Re: [2.6.27-rc2-git4] Kernel panic on VIA Ester+VIA CX700

On Mon, 11 August 2008 Suresh Siddha <suresh.b.siddha@...el.com> wrote:
> On Sun, Aug 10, 2008 at 04:39:23PM -0700, H. Peter Anvin wrote:
> > Bruno Prémont wrote:
> > >
> > > Recompiling without viafb+squashfs patches makes the panic go
> > > away.
> > >
> > > So something from viafb or squashfs triggers the panic or prepare
> > > for something else to trigger it...
> > >
> > 
> > Out of those, viafb by far seems most likely.  Could you try
> > compiling with only one or the other?
> 
> [    5.010629] general protection fault: 0000 [#1] 
> [    5.021782] Modules linked in:
> [    5.030227] 
> [    5.030227] Pid: 3, comm: ksoftirqd/0 Not tainted (2.6.27-rc2-git4_nocrypto #1)
> [    5.030227] EIP: 0060:[<c01042f5>] EFLAGS: 00010046 CPU: 0
> [    5.030227] EIP is at math_state_restore+0x25/0x60
> [    5.030227] EAX: f781db3d EBX: f781d898 ECX: 00000000 EDX: 00000000
> [    5.030227] ESI: f781d000 EDI: f782c20c EBP: f781d840 ESP: f781d838
> [    5.030227]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [    5.030227] Process ksoftirqd/0 (pid: 3, ti=f781d000 task=f782c6a0 task.ti=f782a000)
> [    5.030227] Stack: f782c000 f782c6a0 f781d890 c01038dd f782c000 00000000 f782c000 f782c6a0
> [    5.030227]        f782c20c f781d890 f7807e00 0000007b 0000007b 00000000 ffffffff c0101e8f
> [    5.030227]        00000060 00010002 f782c8ac f782c000 c04d8180 c011fa50 f782afc0 c03cfa61
> [    5.030227] Call Trace:
> [    5.030227]  [<c01038dd>] ? device_not_available+0x2d/0x32
> [    5.030227]  [<c0101e8f>] ? __switch_to+0x2f/0x130
> 
> It got a GP fault, because in __switch_to() we were doing
> unlazy_fpu() and fxsave generated a DNA fault(which shouldn't happen
> unless we are hitting the via padlock instruction issue or something
> else) and the math_state_restore() found the task's math state
> pointer to be 0xf781db3d (EAX in the oops) and while doing fxrstor we
> got GP fault, as the fxrstor pointer(EAX) is not 16byte aligned.
> 
> It is interesting to see the EAX value similar to stack pointer.
> Task's FP area gets dynamically allocated and as such EAX def looks
> wrong here. I also see the config is using 4K stacks. Some
> config(viafb/squashfs?) causing some thing wrong with the kernel
> stack? --
That's quite possible...

I just recompiled with some more stack debugging enabled (which didn't help),
then disabled 4K stacks, and now the system boots.
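
As a rough sanity check of the analysis quoted above, the sketch below redoes
the address arithmetic on the register values from the oops. It is only an
illustration: the helper names (is_aligned, stack_base) are made up here, and
the only inputs taken from the report are EAX, ESP and the "ti=" value. It
assumes a 4096-byte kernel stack (CONFIG_4KSTACKS=y) and the 16-byte alignment
that fxrstor requires for its memory operand.

```python
# Values copied from the quoted oops (32-bit x86, CONFIG_4KSTACKS=y).
EAX = 0xF781DB3D  # pointer passed to fxrstor in math_state_restore()
ESP = 0xF781D838  # stack pointer at the time of the fault
TI = 0xF781D000   # "ti=" value printed in the oops

FXSAVE_ALIGN = 16     # fxsave/fxrstor need a 16-byte aligned operand
STACK_SIZE_4K = 4096  # assumed kernel stack size with 4K stacks


def is_aligned(addr, align):
    """Return True if addr is a multiple of align (a power of two)."""
    return addr & (align - 1) == 0


def stack_base(addr, stack_size):
    """Round addr down to the start of its stack_size-sized region."""
    return addr & ~(stack_size - 1)


# The fxrstor pointer is not 16-byte aligned, hence the GP fault.
print("EAX 16-byte aligned:", is_aligned(EAX, FXSAVE_ALIGN))

# ESP rounds down to the "ti=" value, and EAX lands in the same 4K region,
# which is why it looks like a stack address rather than an allocated
# FP save area.
print("stack base of ESP: %#x" % stack_base(ESP, STACK_SIZE_4K))
print("stack base of EAX: %#x" % stack_base(EAX, STACK_SIZE_4K))
print("EAX - ESP: %#x bytes" % (EAX - ESP))
```

This does not show what corrupted the pointer. It only confirms that the
value in EAX lies within the 4K region of the stack in use at the time.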

Is there anything I forgot to enable to get a stack trace at the moment the
stack overflows, instead of at some random time later on?
I'm also wondering why maximum stack usage is only printed for userspace apps
or kernel threads once init is running. Is stack usage not checked earlier
during the boot process?

Changes to posted config:
 CONFIG_X86_VERBOSE_BOOTUP=y
 CONFIG_EARLY_PRINTK=y
 CONFIG_DEBUG_STACKOVERFLOW=y
-# CONFIG_DEBUG_STACK_USAGE is not set
+CONFIG_DEBUG_STACK_USAGE=y
 # CONFIG_DEBUG_PAGEALLOC is not set
 # CONFIG_X86_PTDUMP is not set
 CONFIG_DEBUG_RODATA=y
 # CONFIG_DEBUG_RODATA_TEST is not set
 # CONFIG_DEBUG_NX_TEST is not set
-CONFIG_4KSTACKS=y
+# CONFIG_4KSTACKS is not set
 CONFIG_DOUBLEFAULT=y
 # CONFIG_MMIOTRACE is not set
 CONFIG_IO_DELAY_TYPE_0X80=0
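
For anyone comparing their own .config against the diff above, here is a
minimal sketch that reports how the three stack-related options are set. It is
a hypothetical helper, not an existing kernel script: parse_kconfig and SAMPLE
are names invented for this example, and SAMPLE is just the 8K-stack side of
the diff above written out as plain .config lines.

```python
# The 8K-stack side of the diff above, as it would appear in a .config.
SAMPLE = """\
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACK_USAGE=y
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_4KSTACKS is not set
CONFIG_DOUBLEFAULT=y
"""

NOT_SET = " is not set"


def parse_kconfig(text):
    """Map option name to its value; '# CONFIG_X is not set' maps to None."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET):
            # Strip the leading "# " and the trailing " is not set".
            options[line[2:-len(NOT_SET)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


cfg = parse_kconfig(SAMPLE)
for name in ("CONFIG_4KSTACKS",
             "CONFIG_DEBUG_STACKOVERFLOW",
             "CONFIG_DEBUG_STACK_USAGE"):
    print(name, "=", cfg.get(name) or "not set")
```

To check a real build, read the kernel's .config file and pass its contents to
parse_kconfig() in place of SAMPLE.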

Attached are boot logs with 4K and 8K stacks, both built with the above config
diff (with CONFIG_4KSTACKS set accordingly).

Bruno

View attachment "venus-2.6.27-rc2.4-stack4k" of type "text/plain" (23975 bytes)

View attachment "venus-2.6.27-rc2.4-stack8k" of type "text/plain" (22265 bytes)
