Message-ID: <20110715141847.GD3428@redhat.com>
Date:	Fri, 15 Jul 2011 10:18:47 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	Michael Holzheu <holzheu@...ux.vnet.ibm.com>
Cc:	Martin Schwidefsky <schwidefsky@...ibm.com>, ebiederm@...ssion.com,
	hbabu@...ibm.com, mahesh@...ux.vnet.ibm.com,
	oomichi@....nes.nec.co.jp, horms@...ge.net.au,
	heiko.carstens@...ibm.com, kexec@...ts.infradead.org,
	linux-kernel@...r.kernel.org, linux-s390@...r.kernel.org
Subject: Re: [patch 0/9] kdump: Patch series for s390 support

On Fri, Jul 15, 2011 at 03:56:21PM +0200, Michael Holzheu wrote:
> Hello Vivec,
> 
> On Thu, 2011-07-14 at 13:55 -0400, Vivek Goyal wrote:
> 
> [snip]
> 
> > > The first thing we want to do is to check if
> > > the purgatory is still fine, that is do a checksum. If we have the
> > > infrastructure in place to do one checksum then we can easily do the
> > > other checksums as well.
> > 
> > Some piece of code you have to assume is fine. Are you not already
> > assuming that IPL code you have in first 64K bytes is fine and no
> > body has overwritten it.
> 
> We can assume that the IPL dump code is fine, because it is freshly
> loaded into memory. Only when the disk is somehow corrupted we have a
> problem.
> 
> > Are you not assuming that hook in panic()
> > (I think you are calling it shutdown trigger) is fine so that it
> > can help you jump to right place.
> 
> Yes, that is correct for automatic dump in case of panic(). The panic()
> path can fail.
> 
> But there are two other options where really *no* code that was in
> memory, when the system crashed, is used for the dump process or
> verification of kdump:
> 1) Manual IPL/boot of stand-alone dump by the operator via the virtual
> guest console
> 2) Automatic IPL/boot of stand-alone dump by our z/VM hypervisor
> watchdog

Hi Michael,

Ok. So IIUC, purgatory code corruption is equivalent to panic() code
corruption, and in that case the above two options will help an admin
capture the dump.

That's precisely the point I am trying to make: stand-alone dump tools
still remain the backup mechanism when kdump fails. Kdump can fail
either because the checksum of the loaded kernel is bad or because the
purgatory code itself got corrupted. In the first case, purgatory itself
can make sure of jumping to the location that IPLs the dump tools, and
in the second case the above two options come into the picture (manual
dump via operator or hypervisor watchdog initiated IPL).
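
To make the two failure cases concrete, here is a rough sketch of the
decision flow in Python. It is only an illustration, not purgatory code:
the names (DumpPath, choose_dump_path, purgatory_intact) are made up,
and SHA-256 is an assumed choice of checksum. Note that purgatory_intact
is a modelling flag only; nothing in the crashed system's memory can
evaluate it, which is exactly why that case is left to the operator or
the hypervisor watchdog.

```python
import hashlib
from enum import Enum


class DumpPath(Enum):
    """Hypothetical labels for the three ways a dump can be captured."""
    KDUMP = "kdump kernel (filtered dump)"
    PURGATORY_IPL = "stand-alone dump tool, IPL'd by purgatory"
    EXTERNAL_IPL = "stand-alone dump tool, IPL'd by operator or watchdog"


def choose_dump_path(purgatory_intact, kernel_image, expected_digest):
    """Return which dump mechanism ends up being used.

    purgatory_intact: modelling flag; False means purgatory (or the
        panic() path leading to it) was overwritten before the crash.
    kernel_image: bytes of the loaded kdump kernel as found in memory.
    expected_digest: hex SHA-256 digest recorded when the kernel was loaded.
    """
    # Second case: purgatory cannot be trusted to run at all, so only an
    # IPL started from outside the crashed system can capture the dump.
    if not purgatory_intact:
        return DumpPath.EXTERNAL_IPL

    # First case: purgatory runs, but the loaded kdump kernel fails its
    # checksum, so purgatory falls back to the stand-alone dump tool.
    if hashlib.sha256(kernel_image).hexdigest() != expected_digest:
        return DumpPath.PURGATORY_IPL

    # Common case: everything verifies and kdump captures the dump.
    return DumpPath.KDUMP


if __name__ == "__main__":
    image = b"example kdump kernel image"
    digest = hashlib.sha256(image).hexdigest()
    print(choose_dump_path(True, image, digest).value)
    print(choose_dump_path(True, image + b"!", digest).value)
    print(choose_dump_path(False, image, digest).value)
```

In this model the stand-alone dump tool never receives any information
about the kdump kernel; it is simply the target of the fallback.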

If we go down this path, it should simplify the design a lot. The dump
tools don't have to know anything about the kdump kernel and there is
no need to pass any information between them.

And in the common case kdump should be able to capture the dump and
filter it. Only in extreme corner cases do we need to trigger this dump
tool mechanism and capture a full memory dump.

How about doing it that way? This should not require many changes in
common kexec code. It will require some changes in kexec-tools though,
as you will have to create a mechanism for purgatory to jump to in
case the kdump kernel checksum fails.

Thanks
Vivek 