Date:	Wed, 10 Apr 2013 12:49:14 -0400
From:	David Teigland <teigland@...hat.com>
To:	Don Zickus <dzickus@...hat.com>
Cc:	Guenter Roeck <linux@...ck-us.net>, Dave Young <dyoung@...hat.com>,
	linux-watchdog@...r.kernel.org, kexec@...ts.infradead.org,
	wim@...ana.be, LKML <linux-kernel@...r.kernel.org>,
	vgoyal@...hat.com
Subject: Re: [RFC PATCH] watchdog: Add hook for kicking in kdump path

On Wed, Apr 10, 2013 at 09:40:39AM -0400, Don Zickus wrote:
> However, we still have the problem that if the machine panics and we want
> to jump into the kdump kernel, we need to 'kick' the watchdog one more
> time.  This provides us a sane sync point for determining how long we have
> to load the watchdog driver in the second kernel before the hardware
> reboots us.  Otherwise the reboots are pretty random and nothing is
> guaranteed.

Some time ago I submitted this patch
http://www.spinics.net/lists/linux-watchdog/msg01477.html

to get rid of the one "extraneous" ping that was causing me trouble.
I'd still like to see it merged, but haven't had time to follow up.

I have a use case where I need to guarantee that the watchdog
will *not* be pinged unless my userland daemon does the ping.
If my daemon is killed, the close() generates a ping that I
don't intend.  This kdump ping looks like it would be another
instance that I'd need to suppress.  Perhaps by renaming my flag
WDOG_NO_EXTRA_PING and checking it both in release and in
kick_for_kdump?
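
The suggestion above can be pictured with a small userspace model. This is only a sketch, not kernel code: the class ToyWatchdog, its methods and the no_extra_ping attribute are hypothetical stand-ins for the proposed WDOG_NO_EXTRA_PING flag being checked in both the release path and kick_for_kdump.

```python
# Toy userspace model of the proposed flag; not kernel code.
# ToyWatchdog, no_extra_ping and the ping "source" labels are hypothetical.

class ToyWatchdog:
    def __init__(self, no_extra_ping=False):
        # Models the proposed WDOG_NO_EXTRA_PING status bit.
        self.no_extra_ping = no_extra_ping
        self.pings = []  # record of who pinged, in order

    def ping(self, source):
        # Every ping restarts the hardware countdown; here we only log it.
        self.pings.append(source)

    def release(self):
        # Models close() on the device: ping only if the flag is not set.
        if not self.no_extra_ping:
            self.ping("release")

    def kick_for_kdump(self):
        # Models the proposed kdump hook: same check as in release().
        if not self.no_extra_ping:
            self.ping("kdump")


def run(no_extra_ping):
    wd = ToyWatchdog(no_extra_ping=no_extra_ping)
    wd.ping("daemon")      # the only ping the daemon intends
    wd.release()           # daemon is killed, device is closed
    wd.kick_for_kdump()    # machine panics and jumps to the kdump kernel
    return wd.pings


if __name__ == "__main__":
    print(run(no_extra_ping=False))
    print(run(no_extra_ping=True))
```

With the flag set, the daemon's ping is the only one recorded; without it, the release and kdump paths each add a ping the daemon did not issue.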

(My daemon associates watchdog pings with shared storage heartbeats.
Based on the heartbeats, hosts in a cluster can calculate when an
unresponsive host last pinged its watchdog, and can be fairly
certain that the "dead" host has been reset by its watchdog 60
seconds later.  This is used as an alternative to i/o fencing
where we're protecting data on shared storage from corruption
after host failures.  If there are uncontrolled watchdog pings,
then hosts don't know when a dead host might have last pinged
its watchdog, since it is no longer based on the last timestamp
it wrote to shared storage.)
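
The timing argument in the paragraph above can be sketched as follows. This is an illustration under stated assumptions, not the daemon's actual code: the function names are hypothetical, timestamps are plain seconds, and the 60-second timeout is taken from the description above.

```python
# Illustrative model of the heartbeat/watchdog timing argument.
# Function names are hypothetical; times are plain seconds.

WATCHDOG_TIMEOUT = 60  # seconds, as in the description above


def assumed_reset_time(last_heartbeat, timeout=WATCHDOG_TIMEOUT):
    """Time at which peers assume the host has been reset.

    Valid only if every watchdog ping coincides with a heartbeat
    written to shared storage.
    """
    return last_heartbeat + timeout


def actual_reset_time(last_ping, timeout=WATCHDOG_TIMEOUT):
    """Time at which the hardware really fires, counted from the last ping."""
    return last_ping + timeout


def peers_may_recover(last_heartbeat, now, timeout=WATCHDOG_TIMEOUT):
    """True once peers consider it safe to take over the shared storage."""
    return now >= assumed_reset_time(last_heartbeat, timeout)


if __name__ == "__main__":
    last_heartbeat = 100
    # Controlled case: the last ping happened together with the last heartbeat.
    print(assumed_reset_time(last_heartbeat), actual_reset_time(last_heartbeat))
    # Uncontrolled case: an extra ping (e.g. on close() or kdump) at t=130.
    extra_ping = 130
    print(assumed_reset_time(last_heartbeat), actual_reset_time(extra_ping))
```

In the uncontrolled case the hardware fires later than the peers assume, so a peer could begin recovery while the "dead" host is still running. That window is what suppressing the extra pings is meant to close.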

Dave