Message-ID: <20130410171722.GC19533@roeck-us.net>
Date: Wed, 10 Apr 2013 10:17:22 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: David Teigland <teigland@...hat.com>
Cc: Don Zickus <dzickus@...hat.com>, Dave Young <dyoung@...hat.com>,
linux-watchdog@...r.kernel.org, kexec@...ts.infradead.org,
wim@...ana.be, LKML <linux-kernel@...r.kernel.org>,
vgoyal@...hat.com
Subject: Re: [RFC PATCH] watchdog: Add hook for kicking in kdump path

On Wed, Apr 10, 2013 at 12:49:14PM -0400, David Teigland wrote:
> On Wed, Apr 10, 2013 at 09:40:39AM -0400, Don Zickus wrote:
> > However, we still have the problem that if the machine panics and we want
> > to jump into the kdump kernel, we need to 'kick' the watchdog one more
> > time. This provides us a sane sync point for determining how long we have
> > to load the watchdog driver in the second kernel before the hardware
> > reboots us. Otherwise the reboots are pretty random and nothing is
> > guaranteed.
>
> Some time ago I submitted this patch
> http://www.spinics.net/lists/linux-watchdog/msg01477.html
>
> to get rid of the one "extraneous" ping that was causing me trouble.
> I'd still like to see it merged, but haven't had time to follow up.
>

The use case makes sense to me, so it gets my Ack. Did Wim ever comment on it?

Thanks,
Guenter

> I have a use case where I need to guarantee that the watchdog
> will *not* be pinged unless my userland daemon does the ping.
> If my daemon is killed, the close() generates a ping that I
> don't intend. This kdump ping looks like it would be another
> instance that I'd need to suppress. Perhaps by renaming my flag
> WDOG_NO_EXTRA_PING and checking it both in release and in
> kick_for_kdump?
>
> (My daemon associates watchdog pings with shared storage heartbeats.
> Based on the heartbeats, hosts in a cluster can calculate when an
> unresponsive host last pinged its watchdog, and can be fairly
> certain that the "dead" host has been reset by its watchdog 60
> seconds later. This is used as an alternative to i/o fencing
> where we're protecting data on shared storage from corruption
> after host failures. If there are uncontrolled watchdog pings,
> then hosts don't know when a dead host might have last pinged
> its watchdog, since it is no longer based on the last timestamp
> it wrote to shared storage.)
>
> Dave
>
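
To make the timing argument in the quoted paragraph concrete, here is a minimal sketch in C of the calculation the peers would do. It is not taken from the patch or from any driver; the function names are hypothetical, and the 60-second timeout is simply the figure quoted above. It assumes timestamps in seconds on a clock all hosts agree on.

```c
#include <stdbool.h>
#include <time.h>

/* Watchdog timeout from the quoted description (60 seconds). */
#define WATCHDOG_TIMEOUT_SECS 60

/* Hypothetical helper: the time at which the hardware resets a host
 * whose watchdog was last pinged at last_ping. */
static time_t actual_reset_time(time_t last_ping)
{
    return last_ping + WATCHDOG_TIMEOUT_SECS;
}

/* Hypothetical helper: the time at which peers conclude the host is
 * gone, based only on the last heartbeat it wrote to shared storage. */
static time_t fence_deadline(time_t last_heartbeat)
{
    return last_heartbeat + WATCHDOG_TIMEOUT_SECS;
}

/* Peers may treat the host as reset once the deadline has passed. */
static bool peer_may_assume_reset(time_t last_heartbeat, time_t now)
{
    return now >= fence_deadline(last_heartbeat);
}

/* The peers' conclusion is only safe if the hardware reset happens no
 * later than the deadline they computed. An extra ping (on close(), or
 * in the kdump path) that is not paired with a heartbeat moves the real
 * reset later and breaks this. */
static bool assumption_is_safe(time_t last_heartbeat, time_t last_actual_ping)
{
    return actual_reset_time(last_actual_ping) <= fence_deadline(last_heartbeat);
}
```

With a last heartbeat at t=1000, peers would wait until t=1060. If the only ping was the one paired with that heartbeat, the host is reset by t=1060 and the assumption holds. If an unrecorded ping happened at t=1030, the reset slips to t=1090 and the peers would be acting 30 seconds too early.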