Message-ID: <1346100297.4732.115.camel@misato.fc.hp.com>
Date: Mon, 27 Aug 2012 14:44:57 -0600
From: Toshi Kani <toshi.kani@...com>
To: "Mingarelli, Thomas" <Thomas.Mingarelli@...com>
Cc: Lars Marowsky-Bree <lmb@...e.com>, "wim@...ana.be" <wim@...ana.be>,
"linux-watchdog@...r.kernel.org" <linux-watchdog@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: RE: [PATCH] hpwdt: Fix kdump issue in hpwdt
On Mon, 2012-08-27 at 19:57 +0000, Mingarelli, Thomas wrote:
> The main issue here is when an NMI comes in (which is hpwdt's main
> focus...to source NMIs and then panic the box) and the system is
> configured for kdump. We want the kdump to succeed and if the iLO
> watchdog timer is left alone to keep running, the kdump will not
> succeed. It will be interrupted by an ASR. This change ensures that
> the iLO Watchdog timer is always stopped in the booting case (of any
> kernel) or when an NMI arrives and we are in the process of taking a
> kdump.
This change does not prevent running the watchdog daemon on the
crash kernel if we want to detect a hang condition there.
The timer is re-enabled when /dev/watchdog is opened. The change only
ensures that the timer stays disabled until the daemon starts up. A timer
left running on the crash kernel without the daemon is a problem because
it causes kdump to be interrupted.
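
For illustration, here is a minimal user-space sketch of the daemon side,
assuming the generic Linux /dev/watchdog character-device interface (open
starts the timer, any write is a keepalive, and writing 'V' before close
requests a stop). The function name watchdog_session and its parameters are
hypothetical and are not taken from hpwdt or from any existing daemon.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the watchdog device at `path`, send `pings`
 * keepalives spaced `interval_s` seconds apart, then close it cleanly.
 * Returns 0 on success, -1 on any error.
 */
int watchdog_session(const char *path, int pings, unsigned int interval_s)
{
    /* Opening the device is what (re-)enables the timer. */
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    for (int i = 0; i < pings; i++) {
        /* Any write to the device counts as a keepalive. */
        if (write(fd, "\0", 1) != 1) {
            close(fd);
            return -1;
        }
        sleep(interval_s);
    }

    /*
     * "Magic close": writing 'V' right before close asks the driver to
     * stop the timer. Drivers loaded with nowayout set ignore this.
     */
    if (write(fd, "V", 1) != 1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

A real daemon would loop indefinitely, for example calling
watchdog_session("/dev/watchdog", n, 10) with an interval well below the
configured timeout. Until something like this opens the device, the timer
stays off with this change applied.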
Thanks,
-Toshi
>
> Tom
>
> -----Original Message-----
> From: Lars Marowsky-Bree [mailto:lmb@...e.com]
> Sent: Monday, August 27, 2012 2:22 PM
> To: Kani, Toshimitsu; wim@...ana.be; linux-watchdog@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org; Mingarelli, Thomas; stable@...r.kernel.org
> Subject: Re: [PATCH] hpwdt: Fix kdump issue in hpwdt
>
> On 2012-08-27T12:52:24, Toshi Kani <toshi.kani@...com> wrote:
>
> > kdump can be interrupted by watchdog timer when the timer is left
> > activated on the crash kernel. Changed the hpwdt driver to disable
> > watchdog timer at boot-time. This assures that watchdog timer is
> > disabled until /dev/watchdog is opened, and prevents watchdog timer
> > to be left running on the crash kernel.
>
> How does this protect against the system hanging again in the crash
> kernel, or possibly hardware caches to flush more data to shared
> storage?
>
> (I'm asking from the perspective of the hpwdt being used as a fencing
> mechanism in a cluster setting.)
>
> Or is the argument that it's "very unlikely" that a system in such a
> state would not make it far enough into the crash kernel?
>
>
> Regards,
> Lars
>