Message-ID: <Pine.LNX.4.64.0809042056270.3012@blonde.site>
Date: Thu, 4 Sep 2008 21:23:01 +0100 (BST)
From: Hugh Dickins <hugh@...itas.com>
To: Rafał Miłecki <zajec5@...il.com>
cc: Alan Jenkins <alan-jenkins@...fmail.co.uk>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Jeremy Fitzhardinge <jeremy@...p.org>,
Yinghai Lu <yhlu.kernel@...il.com>,
Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] x86: check for and defend against BIOS memory corruption
On Thu, 4 Sep 2008, Rafał Miłecki wrote:
> > 2008/8/29 Rafał Miłecki <zajec5@...il.com>:
> > 2008/8/29 Hugh Dickins <hugh@...itas.com>:
> >> Here's my version of Jeremy's patch, that I've now tested on my machines,
> >> as x86_32 and as x86_64. It addresses none of the points Alan Cox made,
> >> and it stays silent for me, even after suspend+resume, unless I actually
> >> introduce corruption myself. Omits Jeremy's check in fault.c, but does
> >> a check every minute, so should soon detect Rafał's HDMI corruption
> >> without any need to suspend+resume.
> >
> > Your periodic test works fine:
> >
> > Corrupted low memory at ffff88000000be9c (be9c phys) = b02a0004
> > <IRQ> [<ffffffff8020fc9b>] check_for_bios_corruption+0x93/0x9f
> > [<ffffffff8020fca7>] ? periodic_check_for_corruption+0x0/0x25
> > [<ffffffff8020fcb0>] periodic_check_for_corruption+0x9/0x25
> >
> > By the way I confirmed this bug on Sony Vaio FW11M (my one is FW11S).
> > Probably more machines from FW11* are affected.
>
> If this patch is known to work fine for Sony Vaio FW* and Alan's
> machine, could it go mainline somehow?
Well.

Thanks for the prod, and I'm certainly remiss for not following
up sooner. But I'm really not at all keen on such a patch going
into mainline myself.

It's an interesting experiment, and I'd be happy to see such a patch
(adjusted to make sure output goes to kerneloops.org) spending a little
while in Fedora Rawhide (who'd be the right contact for that?).

But so far as mainline goes, I share Alan Cox's opinion that we should
not be chopping pages out of every x86 user's memory, just because a
couple of machines with faulty BIOSes have been observed.

Particularly now it's evident that the 64kB "limit" is no more than a
reflection of where the directmap pagetable changes have caught such
corruption.

If lots more such corruptions are reported, of course I would change
my position; but those bad directmap PMD crashes are themselves quite
recognizable now we know to look out for them.

I would prefer you both to use the minimal memmap= solutions for now;
but others may disagree.
Hugh
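
As an illustration of the "minimal memmap=" approach mentioned above, the sketch below derives a boot parameter of the form memmap=nn[KMG]$ss[KMG] (which marks a region as reserved) from the address in the quoted report. It is only a sketch under stated assumptions: 4 KiB pages, a direct-map base inferred from that single log line, and hypothetical helper names (virt_to_phys, memmap_reserve_arg) that do not come from the thread. Reserving one page may not cover every address a given BIOS scribbles on, so the range should be checked against actual reports.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86

# Inferred from the quoted report (ffff88000000be9c corresponds to be9c phys).
# Not a constant to rely on for other kernels or configurations.
DIRECT_MAP_BASE = 0xFFFF88000000BE9C - 0xBE9C


def virt_to_phys(virt):
    """Translate a direct-map virtual address to its physical address."""
    return virt - DIRECT_MAP_BASE


def memmap_reserve_arg(phys, pages=1):
    """Build a memmap=<size>$<start> argument reserving the page(s) at phys.

    The start is rounded down to a page boundary; the size is given in KiB.
    """
    base = phys & ~(PAGE_SIZE - 1)
    size_kib = pages * PAGE_SIZE // 1024
    return f"memmap={size_kib}K${base:#x}"


if __name__ == "__main__":
    phys = virt_to_phys(0xFFFF88000000BE9C)
    print(hex(phys))
    print(memmap_reserve_arg(phys))
```

Note that some boot loaders treat "$" specially, so the character may need escaping when the argument is added to the kernel command line.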