Message-ID: <AANLkTik3XiOjYM67RdPG-3RM3rY-0vondiMmJsETAHfX@mail.gmail.com>
Date: Thu, 10 Jun 2010 12:38:10 -0600
From: Brian Gordon <legerde@...il.com>
To: Andi Kleen <andi@...stfloor.org>
Cc: linux-kernel@...r.kernel.org
Subject: Re: Aerospace and linux
> It's also a serious consideration for standard servers.
Yes. Good point.
> On server class systems with ECC memory hardware does that.
> Normally server class hardware handles this and the kernel then reports
> memory errors (e.g. through mcelog or through EDAC)
Agreed. EDAC is a good and sane solution and most companies use it.
Some do not, due to naivety or cost reduction. EDAC doesn't cover
processor registers, and I have fairly good solutions for dealing
with that in tiny "home-grown" tasking systems.
On the more exotic end, I have also seen systems that have dual
redundant processors / memories. They add compare logic between
the redundant processors that compares most pins each clock cycle. If
any pins are not identical at a clock cycle, then something has gone
wrong (SEU, hardware failure, etc.).
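As a rough illustration of the compare logic described above, here is a
small software model of two redundant cores stepped in lockstep, with
their outputs compared every cycle. Everything in it is hypothetical:
the Core class, the 32-bit accumulator, and the upset_cycle / upset_bit
parameters are stand-ins I made up to simulate an SEU. Real systems do
this comparison in hardware on the pins, not in software.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Hypothetical stand-in for one of the two redundant processors."""
    acc: int = 0

    def step(self, operand: int) -> int:
        # One "clock cycle": update internal state and return the value
        # this core would drive on its output pins.
        self.acc = (self.acc + operand) & 0xFFFFFFFF
        return self.acc


def run_lockstep(operands, upset_cycle=None, upset_bit=0):
    """Step both cores on the same inputs and compare outputs each cycle.

    Returns the first cycle at which the outputs differ, or None if the
    cores stayed identical. upset_cycle/upset_bit are illustrative knobs
    that flip one bit in core B's register to mimic an SEU.
    """
    a, b = Core(), Core()
    for cycle, operand in enumerate(operands):
        if cycle == upset_cycle:
            b.acc ^= 1 << upset_bit  # simulated single event upset
        if a.step(operand) != b.step(operand):
            return cycle  # the comparator would raise a fault here
    return None
```

Note that a mismatch only says that one of the two cores is wrong, not
which one, so this arrangement detects the fault but cannot correct it
on its own.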
> Lower end systems which are optimized for cost generally ignore the
> problem though and any flipped bit in memory will result
> in a crash (if you're lucky) or silent data corruption (if you're unlucky)
Right! And this is the area that I am interested in. Some people
insist on lowering the cost of the hardware without considering these
issues. One thing I want to do is to be as diligent as possible (even
in these low cost situations) and do the best job I can in spite of
the low cost hardware.
So, some pages of RAM are going to be read-only and the data in those
pages came from some source (file system?). Can anyone describe a
high-level strategy to occasionally provide some coverage of this data?
So far I have thought about adding an MD5 hash to the page descriptor
when a read-only page is first "loaded/mapped?", which a background
daemon could then occasionally verify. Does tripwire accomplish this
kind of detection by monitoring the underlying filesystem (I don't
think so)?
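
To make the hash-and-verify idea above concrete, here is a user-space
sketch of it, not kernel code. The PageScrubber class, its method
names, and the 4096-byte PAGE_SIZE are all assumptions for
illustration; in the kernel the digest would have to live in or
alongside the page descriptor, and the scrub pass would run from a
background thread.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for this sketch


class PageScrubber:
    """Hypothetical model: record a digest when a read-only page is
    mapped, then re-verify all recorded pages on demand."""

    def __init__(self):
        self._pages = {}    # page number -> page contents (bytearray)
        self._digests = {}  # page number -> MD5 digest taken at map time

    def map_readonly(self, pfn, data):
        # Pad or truncate to one page, then record the reference digest.
        page = bytearray(data[:PAGE_SIZE].ljust(PAGE_SIZE, b"\0"))
        self._pages[pfn] = page
        self._digests[pfn] = hashlib.md5(page).digest()
        return page

    def scrub(self):
        # One pass of the "background daemon": recompute each digest and
        # return the page numbers whose contents no longer match.
        return [
            pfn
            for pfn, page in self._pages.items()
            if hashlib.md5(page).digest() != self._digests[pfn]
        ]
```

For example:

```python
scrubber = PageScrubber()
page = scrubber.map_readonly(7, b"some read-only text")
page[100] ^= 0x04          # simulate a single flipped bit
print(scrubber.scrub())    # lists page 7 as corrupted
```

Since the data originally came from some backing source, a page that
fails the check could presumably be reloaded from there, though the
sketch does not attempt that. MD5 is used only because it is what I
mentioned above; a cheaper checksum such as a CRC may be enough for
detecting random bit flips.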