Message-ID: <20100524155506.GA7145@sgi.com>
Date: Mon, 24 May 2010 10:55:06 -0500
From: Russ Anderson <rja@....com>
To: Tony Luck <tony.luck@...il.com>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
Andi Kleen <andi@...stfloor.org>,
Borislav Petkov <bp@...64.org>,
Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>,
Mauro Carvalho Chehab <mchehab@...hat.com>,
"Young, Brent" <brent.young@...el.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Matt Domsch <Matt_Domsch@...l.com>,
Doug Thompson <dougthompson@...ssion.com>,
Joe Perches <joe@...ches.com>, Ingo Molnar <mingo@...e.hu>,
"bluesmoke-devel@...ts.sourceforge.net"
<bluesmoke-devel@...ts.sourceforge.net>,
Linux Edac Mailing List <linux-edac@...r.kernel.org>,
rja@....com
Subject: Re: Hardware Error Kernel Mini-Summit
On Wed, May 19, 2010 at 10:30:17AM -0700, Tony Luck wrote:
>
> We are still in the dark ages for memory errors where the OS
> is expected to look at all the errors and figure out whether they
> represent any kind of meaningful pattern that requires some
> action to replace h/w components.
ia64 is good at detecting and recovering from uncorrectable memory
errors. x86 is significantly behind, because historically it has not
been able to recover from uncorrectable memory errors.

ia64 had the Intel-defined MCA spec, which defined the interaction
between SAL and the kernel. x86 does not have a similarly well-defined
way of handling errors. It would be good to agree on how the errors
should be handled.
> -Tony
--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@....com