Message-ID: <4D37FBE1.7010704@linux.vnet.ibm.com>
Date: Thu, 20 Jan 2011 14:39:53 +0530
From: Anithra P Janakiraman <anithra@...ux.vnet.ibm.com>
To: Américo Wang <xiyou.wangcong@...il.com>
CC: linux-kernel@...r.kernel.org,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
vatsa@...ux.vnet.ibm.com, Dave Hansen <dave@...ux.vnet.ibm.com>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Ananth N Mavinakayanahalli <ananth@...ibm.com>
Subject: Re: [PATCH 0/0] Panic on softdog timeout
On 01/18/2011 09:22 PM, Américo Wang wrote:
> On Tue, Jan 18, 2011 at 06:14:36PM +0530, Anithra P Janakiraman wrote:
>>
>> Hi.
>>
>> We currently have no way of determining the reason for failure when a
>> softdog timeout occurs. At the minimum a snapshot of the system would
>> help to determine the cause.
>> The attached patch invokes panic on softdog timeout iff kdump is
>> configured, if kdump is not configured it works as usual.
>>
>
> We don't do it in this way, check softlockup_panic, we have
> a boot parameter, i.e. "softlockup_panic=". :)
Some softdog-specific scenarios cannot be handled by the softlockup
detector. We use softdog to watch for critical application failures,
where the application may have failed even though there is no
softlockup as such.
For example, when doing high-availability tests on applications, softdog is
set up so that the timer is reset by an application thread. If the
application fails, the timer expires and causes a reboot. In such
scenarios some information on what caused the failure would be useful,
and I don't see how the softlockup detector can be used. The patch I sent
would be useful in these cases. If I am missing something, please do let me know.
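To make the setup concrete, here is a rough sketch of the kind of heartbeat an application thread runs against the watchdog device. It assumes the standard /dev/watchdog character device that softdog provides; the helper names (wd_open, wd_pet, wd_disarm, wd_heartbeat), the interval, and the "healthy" flag are hypothetical and only for illustration, not taken from our application.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper names; only open/write/close and the
 * /dev/watchdog device node are the real interface. */

/* Open the watchdog device. With softdog loaded, opening arms the timer. */
int wd_open(const char *path)
{
    return open(path, O_WRONLY);
}

/* Reset ("pet") the timer: any non-empty write counts as a keepalive. */
int wd_pet(int fd)
{
    return write(fd, "\0", 1) == 1 ? 0 : -1;
}

/* Orderly shutdown: writing 'V' just before close asks the driver to
 * stop the timer (ignored if the module was loaded with nowayout set). */
int wd_disarm(int fd)
{
    int ok = write(fd, "V", 1) == 1;
    return (close(fd) == 0 && ok) ? 0 : -1;
}

/* Body of the application's heartbeat thread. If the application hangs
 * or dies, this loop stops petting and the softdog timer expires. */
int wd_heartbeat(int fd, unsigned interval_sec, volatile int *healthy)
{
    while (*healthy) {
        if (wd_pet(fd) != 0)
            return -1;
        sleep(interval_sec);
    }
    return 0; /* caller decides whether to disarm or let the timer fire */
}
```

The application would call wd_open("/dev/watchdog") once and then run wd_heartbeat() from a dedicated thread. Nothing in this path looks like a softlockup to the kernel: the CPU is not stuck, the application has simply stopped writing to the device, which is why the timeout is the only signal we get.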
I will make the modifications as suggested by Dave Hansen and post the
patch shortly.
Anithra.