Message-ID: <20080805132957.398c1036@lxorguk.ukuu.org.uk>
Date: Tue, 5 Aug 2008 13:29:57 +0100
From: Alan Cox <alan@...rguk.ukuu.org.uk>
To: linasvepstas@...il.com
Cc: "Robert Hancock" <hancockr@...w.ca>,
"John Stoffel" <john@...ffel.org>,
"Alistair John Strachan" <alistair@...zero.co.uk>,
linux-kernel@...r.kernel.org
Subject: Re: amd64 sata_nv (massive) memory corruption
> have EDAC turned on, or something ... I'm investigating now.
> But this is moot -- if there is software that already exists that
> could have reported the error to the kernel, then this software
> should have been installed/enabled/operating by default.
That gets you into arguments with the people who care about performance,
but it's really a distribution-level debate, and I suspect the answer is
itself distro-specific, depending on usage.
> Personally I'm ready to pop $$$ for ECC if it will actually do
> something for me, this has been painful.
On a decent system ECC will do something. A modern server PC actually has
pretty good coverage on CPU L1, L2 and optionally RAM. I/O controllers
and disk internal caches seem to be a bit more variable, which is one
reason big HPC cluster projects often checksum end to end - when you
produce terabytes of data, all the one-in-a-hundred-billion error stats
start to look less than reassuring.
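
As a rough illustration of the end-to-end idea (a minimal sketch, not what
any particular HPC project does): compute a digest where the data is
produced, carry it alongside the data, and recompute it where the data is
consumed. The function names are made up for this example, SHA-256 is just
one reasonable digest, and treating the "one in a hundred billion" figure
as a per-bit rate is an assumption, since no unit is given above.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Digest of a file's contents, read back in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_with_checksum(path, data):
    """Digest the data in memory (the producer end), then write it out."""
    expected = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
    return expected


def verify(path, expected):
    """Recompute the digest at the consumer end and compare."""
    return sha256_of(path) == expected


def expected_errors(n_bytes, error_rate=1e-11, per_bit=True):
    """Expected number of corrupted units for a given volume of data.

    error_rate=1e-11 is the "one in a hundred billion" figure; applying
    it per bit rather than per byte is an assumption of this sketch.
    """
    units = n_bytes * 8 if per_bit else n_bytes
    return units * error_rate
```

Under those assumptions, expected_errors(10**12) for one terabyte comes out
at around 80 bad bits. Note that verifying immediately after a write may
just read back from the page cache, so a check like this is most useful
after the data has crossed the I/O path you distrust.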
Alan