Date:	Mon, 14 Apr 2008 12:33:11 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Neil Horman <nhorman@...hat.com>
Cc:	vgoyal@...hat.com, nhorman@...hat.com, nickpiggin@...oo.com.au,
	k-miyoshi@...jp.nec.com, greg@...ah.com, bwalle@...e.de,
	kdb@....sgi.com, kexec@...ts.infradead.org, t-nagano@...jp.nec.com,
	linux-kernel@...r.kernel.org, rdunlap@...otime.net,
	ebiederm@...ssion.com, kaos@....com.au
Subject: Re: [PATCH 0/2] add new notifier function ,take3

On Mon, 14 Apr 2008 12:01:46 -0400
Neil Horman <nhorman@...hat.com> wrote:

> On Mon, Apr 14, 2008 at 10:53:23AM -0400, Vivek Goyal wrote:
> > On Mon, Apr 14, 2008 at 10:42:28AM -0400, Neil Horman wrote:
> > > On Mon, Apr 14, 2008 at 09:46:22AM -0400, Vivek Goyal wrote:
> > > > On Fri, Apr 11, 2008 at 09:07:51PM -0700, Andrew Morton wrote:
> > > > 
> > > > [..]
> > > > > > Kernel panic - not syncing: Panic by panic_module.
> > > > > > __tunable_atomic_notifier_call_chain enter
> > > > > > msg_handler:panic_event was called.
> > > > > > ipmi_wdog:wdog_panic_handler was called.
> > > > > > notifier_test: notifier_test_panic() is called.
> > > > > > notifier_test: notifier_test_panic2() is called.
> > > > > 
> > > > > OK.  But I don't see anywhere in here the most important piece of
> > > > > information: why do we need this feature in Linux?
> > > > > 
> > > > > What are the use-cases?  What is the value?  etc.
> > > > > 
> > > > > Often I can guess (but I like the originator to remove the guesswork).  In
> > > > > this case I'm stumped - I can't see any reason why anyone would want this.
> > > > > 
> > > > 
> > > > Hi Andrew,
> > > > 
> > > > To begin with, he wants kdb, kgdb etc to co-exist with kdump. He wants
> > > > to put all the RAS tools (which are interested in the panic event) on a list,
> > > > export it to user space, and let the user decide in what order the tools get
> > > > executed at panic time (based on priority).
> > > > 
> > > > This raises some reliability concerns for kdump, due to notifier
> > > > code being run after panic.
> > > > 
> > > > I think people want to use this infrastructure beyond RAS tools. I
> > > > remember somebody wanting to send a message to remote node after a
> > > > panic (before kdump kicks in)  so that remote node can initiate failover
> > > > etc.
> > > > 
> > > I know it doesn't particularly relate to this patch, but FWIW, for cases like
> > > failover, I've inserted infrastructure in the userspace part of kdump for
> > > Fedora/RHEL to support this sort of thing.  We can run arbitrary scripts right
> > > before and after a capture so that notifications can be sent to remote nodes in
> > > a much safer fashion than using the notifier chain after a panic.
> > > Neil
> > > 
> > 
> > That's great. I did not know about these. So user can write custom
> > scripts/binaries which can be packed into kdump initrd and executed either
> > before or after dump capture? Any idea, if somebody has started using it
> > already?
> > 
> That's exactly right.  I'm not sure if there is any serious use as of yet, but
> I've had some interrogatories about it.  Specific cases that I recall include:
> 
> 1) A set of users in Japan that are using the pre-dump script to block execution
> until a SCSI controller detects all its drives (it apparently takes up to three
> minutes to scan its bus)
> 
> 2) I think some people using clustering services were using the pre-script to
> notify cluster peers of the failure to avoid power fencing while a node
> completed the crash dump
> 
> 3) A national lab had an interest in using the pre script to send an email to an
> administrative address to log the failure in a cluster 
> 

OK, thanks.

I think I'll duck the patch for now as it seems that a little more thought
and coordination is needed.

Plus it appears that the only users of this infrastructure are provided via
presently-out-of-tree patches, so people who are already patching and
building their own kernels can easily add this other patch as well, for now.