Message-ID: <20111005172036.GD18592@gere.osrc.amd.com>
Date:	Wed, 5 Oct 2011 19:20:37 +0200
From:	Borislav Petkov <bp@...en8.de>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	"Luck, Tony" <tony.luck@...el.com>,
	"K.Prasad" <prasad@...ux.vnet.ibm.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"crash-utility@...hat.com" <crash-utility@...hat.com>,
	"kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
	Andi Kleen <andi@...stfloor.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	"anderson@...hat.com" <anderson@...hat.com>,
	"tachibana@....nes.nec.co.jp" <tachibana@....nes.nec.co.jp>,
	"oomichi@....nes.nec.co.jp" <oomichi@....nes.nec.co.jp>
Subject: Re: [Patch 1/4][kernel][slimdump] Add new elf-note of type
 NT_NOCOREDUMP to capture slimdump

On Wed, Oct 05, 2011 at 01:10:07PM -0400, Vivek Goyal wrote:
> On Wed, Oct 05, 2011 at 08:58:53AM -0700, Luck, Tony wrote:
> > > > The plan is to pass-down the list of poisoned memory pages to the second
> > > > kernel using an elf-note so that these pages are left untouched during
> > > > dump capture. I'm working on an implementation of the same and should
> > > > have patches soon.
> > >
> > > I would say let us first figure out what happens while reading a poisoned
> > > page and is this a problem before working on a solution.
> > 
> > If the page is poisoned because of a real uncorrectable error in memory
> > (reported as SRAO machine check today, or by SRAR real-soon-now), then
> > accessing the page from the processor while taking a memory dump will
> > result in a machine check.
> > 
> > Note that a large memory system that had been running for a long time
> > may have built up a small stash of these land-mine pages - and we need
> > to worry about them even in the case where the panic is not machine
> > check related (in fact especially in this case ... we are in a case
> > where we actually do want the dump to diagnose the cause of the panic,
> > and we don't want to risk losing the crash dump because we aborted when
> > touching a page that the OS had safely avoided for days/weeks/months).
> > 
> > So passing a list of poisoned pages from the old kernel to the new kernel
> > is a good idea - and is independent of the cause of the crash (except that
> > in the fatal machine check case due to memory error the list is guaranteed
> > to be non-empty).
> 
> Where is this poisoned page info stored? In struct page? If yes, then
> user space can walk through it and make sure not to touch poisoned pages.
> Anyway, the user space filtering utility "makedumpfile" walks through struct
> pages to filter out pages. It should be able to filter out
> poisoned pages unconditionally. So there should be no need for the kernel
> to export a list of these pages.

Does this utility work on an already-created vmcore dump? If so, note that
Tony is referring to the creation of the vmcore itself from the memory used
by the first kernel. If there are poisoned pages, merely accessing the
portion of DRAM containing the poisoned data would cause further MCEs in the
freshly booted kernel, so you won't be able to finish creating the dump.

Thus having a list of locations to sidestep could be one possible
solution.
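
To make the idea concrete, here is a rough userspace sketch, not taken from
any posted patch. It assumes the first kernel hands over a sorted array of
poisoned page frame numbers; the names pfn_is_poisoned() and dump_range()
are hypothetical, and the actual page read is left out on purpose.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback for bsearch() over an ascending array of PFNs. */
static int pfn_cmp(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical helper: returns nonzero if pfn appears in the sorted
 * list of poisoned page frame numbers passed down by the first kernel.
 */
static int pfn_is_poisoned(const uint64_t *list, size_t n, uint64_t pfn)
{
	return bsearch(&pfn, list, n, sizeof(*list), pfn_cmp) != NULL;
}

/*
 * Hypothetical dump loop over the PFN range [start, end).
 * Returns the number of pages that would be copied and stores the
 * number of poisoned pages that were sidestepped in *skipped.
 */
static size_t dump_range(uint64_t start, uint64_t end,
			 const uint64_t *poisoned, size_t n,
			 size_t *skipped)
{
	size_t copied = 0;
	uint64_t pfn;

	*skipped = 0;
	for (pfn = start; pfn < end; pfn++) {
		if (pfn_is_poisoned(poisoned, n, pfn)) {
			/* Never touch the page: reading it would raise an MCE. */
			(*skipped)++;
			continue;
		}
		/* A real tool would read and write out the page here. */
		copied++;
	}
	return copied;
}
```

The point is only that the check happens before any access to the page, so
the list has to be available to whatever creates the vmcore.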

-- 
Regards/Gruss,
Boris.