[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20061012140644.732227f7.rdunlap@xenotime.net>
Date: Thu, 12 Oct 2006 14:06:44 -0700
From: Randy Dunlap <rdunlap@...otime.net>
To: Edward Goggin <egoggin@....com>
Cc: <linux-kernel@...r.kernel.org>
Subject: Re: looking for explanation of spontaneous reset/reboot on Opteron
On Thu, 12 Oct 2006 13:43:15 -0400 Edward Goggin wrote:
> I'm looking for information about potential causes for a
> spontaneous reboot of a dual core Opteron running RHEL 4
> linux (2.6.9 derivative). I'm thinking the cause is
> due to the box taking a triple fault and resetting the
> BIOS.
>
> I'm also thinking that the triple fault is caused by a
> kernel stack overflow into an unmapped virtual page
> frame. Is this a reasonable explanation? Are there
> others?
>
> What are reasonable debugging strategies for handling
> this? Following the kernel stack overflow hunch, I'm
> going to try increasing the Opteron stack size from
> 8K to 16K. Can this be done by simply changing
> THREAD_ORDER in include/asm-x86_64/page.h from 1 to 2?
>
> Also, is there any kernel stack overflow detection
> debugging code anywhere for x86_64 as there is for
> i386?
[Please start a new thread instead of replying to a diff.
one and changing the subject.]
arch/x86_64/Kconfig.debug has these Kernel hacking options:
[but I'm not talking about 2.6.9; maybe you should/could try
a newer kernel]
config DEBUG_STACKOVERFLOW
bool "Check for stack overflows"
depends on DEBUG_KERNEL
help
This option will cause messages to be printed if free stack space
drops below a certain limit.
config DEBUG_STACK_USAGE
bool "Stack utilization instrumentation"
depends on DEBUG_KERNEL
help
Enables the display of the minimum amount of free stack which each
task has ever had available in the sysrq-T and sysrq-P debug output.
This option will slow down process creation somewhat.
---
~Randy
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists