Message-ID: <20111014141450.GB4142@redhat.com>
Date:	Fri, 14 Oct 2011 10:14:50 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	"K.Prasad" <prasad@...ux.vnet.ibm.com>
Cc:	Borislav Petkov <bp@...en8.de>, linux-kernel@...r.kernel.org,
	crash-utility@...hat.com, kexec@...ts.infradead.org,
	Andi Kleen <andi@...stfloor.org>,
	"Luck, Tony" <tony.luck@...el.com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>, anderson@...hat.com,
	tachibana@....nes.nec.co.jp, oomichi@....nes.nec.co.jp,
	Valdis.Kletnieks@...edu, Nick Bowler <nbowler@...iptictech.com>
Subject: Re: [Patch 1/4][kernel][slimdump] Add new elf-note of type
 NT_NOCOREDUMP to capture slimdump

On Fri, Oct 14, 2011 at 05:00:25PM +0530, K.Prasad wrote:
> On Wed, Oct 12, 2011 at 11:51:44AM -0400, Vivek Goyal wrote:
> > On Wed, Oct 12, 2011 at 12:14:34AM +0530, K.Prasad wrote:
> > > On Mon, Oct 10, 2011 at 09:07:25AM +0200, Borislav Petkov wrote:
> > > > On Fri, Oct 07, 2011 at 09:42:19PM +0530, K.Prasad wrote:
> [snipped]
> > > 
> > > ii) Scenario2: System with PG_hwpoison (or landmine!) pages crashes because
> > > of a software bug. In this case, kexec kernel would normally reboot because
> > > of reading the PG_hwpoison page. I'll soon get a new version of the patchset
> > > implementing this.
> > > 
> > > Solution: Maintain a linked list of PFNs when the corresponding 'struct page'
> > > has been marked PG_hwpoison. We could export/put this list to use in
> > > quite a few ways.
> > 
> > What's the need for a list, and why do we have to export anything? Can't
> > makedumpfile look at the struct page and then just not dump that page if
> > the hwpoison flag is set?
> >
> 
> I'll respond to just this part of the comment for now, since I have a
> few conflicting thoughts crossing my mind regarding the above suggestion
> and thought I'll put it across to the community to get that clarified.
> 
> Using makedumpfile to actually identify and sidestep PG_hwpoison sounds
> a bit dangerous. Let's assume for a moment that makedumpfile has this
> capability, implemented as follows.
> 
> - The list of nodes (pg_data_t) and all struct page's (through
>   node_mem_map) are sent to makedumpfile using VMCOREINFO_SYMBOL().
> 
> - makedumpfile would use this information to go to the old kernel's
>   memory, look at pg_data_t and then into each element of node_mem_map
>   to look for PG_hwpoison inside 'struct page'->flags. (Well,
>   this method works for !SPARSEMEM. I'd like to know if I've overlooked
>   any better method. pfn_to_page() wouldn't work either, as it will
>   give a 'struct page' of a PFN as seen by the kexec'd kernel and not
>   the crashed kernel).
> 
> - If PG_hwpoison flag for the corresponding page is clear, then it
>   will allow the copy operation.
> 
> - The problem comes when we actually land on a page with PG_hwpoison
>   while carrying out the above 3 steps. For instance, if the pages
>   containing the pg_data_t and node_mem_map data structures are
>   themselves marked hw-poisoned.

I think it can happen, and in that case we don't capture the dump. This
is similar to the possibility of running into a poisoned page while you
are trying to save the final note which will contain the MCE info or
the list of poisoned pages.

Even if you export the list successfully and you find that the pg_data_t
pages are poisoned, what would you do? Skip filtering and save terabytes
of dump?
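
To make the filtering idea concrete, here is a rough sketch of the
per-page check a dump tool could apply once it has read the flags word
of each 'struct page' from the crashed kernel. This is only a toy model,
not makedumpfile code: PG_HWPOISON_BIT, is_hwpoisoned() and
select_pfns_to_dump() are hypothetical names, the bit position is an
assumed value, and reading the flags out of the old kernel's memory is
left out entirely.

```python
# Toy model of hwpoison filtering; this is not makedumpfile code.
# ASSUMPTION: the bit position below is made up for illustration. The real
# PG_hwpoison bit depends on kernel version and config, and would have to
# be taken from the crashed kernel rather than hard-coded.
PG_HWPOISON_BIT = 19


def is_hwpoisoned(flags, bit=PG_HWPOISON_BIT):
    """Return True if the poison bit is set in a page's flags word."""
    return bool(flags & (1 << bit))


def select_pfns_to_dump(page_flags):
    """Return the PFNs that are safe to copy.

    page_flags is a list indexed by PFN; each entry is the flags word
    that was read from the corresponding 'struct page'.
    """
    return [pfn for pfn, flags in enumerate(page_flags)
            if not is_hwpoisoned(flags)]
```

With four pages where PFNs 1 and 3 carry the poison bit, only PFNs 0 and
2 would be copied. The sketch does not cover the case where the pages
holding the flags are themselves poisoned; as said above, in that case
we simply don't get a dump.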

I think you are just trying to solve every corner case which might
not even be required in practice. Kdump is our best effort to capture
the dump and there are so many corner cases where it will not work.

So I would suggest that we not make the whole thing too complicated
now. If the scenario you are describing becomes common enough that
it starts bothering us, we can look into exporting the poisoned pages list.

Thanks
Vivek 