linux-kernel - Re: 2.6.22-rc4-mm1

Message-ID: <20070607165916.GZ11115@waste.org>
Date:	Thu, 7 Jun 2007 11:59:16 -0500
From:	Matt Mackall <mpm@...enic.com>
To:	WANG Cong <xiyou.wangcong@...il.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, Rusty Russell <rusty@...tcorp.com.au>
Subject: Re: 2.6.22-rc4-mm1

On Fri, Jun 08, 2007 at 12:39:30AM +0800, WANG Cong wrote:
> >Ketchup doesn't even look inside patches, and patch doesn't invent
> >names, so something in the bzip2 -> patch(1) -> filesystem chain got
> >corrupted. Probably not bzip2, as it has CRCs.
> >
> 
> Do you mean ketchup doesn't do anything if a file is corrupted?

Ketchup never even sees the filenames. It just calls bzip2 | patch. So
it can't be responsible for damaging the filename.
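
To illustrate why bzip2 is an unlikely culprit, here is a minimal Python sketch (not ketchup's code, and not how the original corruption was diagnosed) that flips a single bit in a bzip2 stream and checks whether decompression notices. The helper names flip_bit and classify_damage and the sample "patch" text are hypothetical; only the standard bz2 module is assumed.

```python
import bz2


def flip_bit(blob: bytes, bit_index: int) -> bytes:
    """Return a copy of blob with one bit inverted (a simulated soft error)."""
    damaged = bytearray(blob)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


def classify_damage(original: bytes, compressed: bytes, bit_index: int) -> str:
    """Flip one bit of the compressed stream and report what the decoder does.

    "detected": decompression raised an error (bad structure or CRC mismatch)
    "harmless": output is still identical to the original
    "silent":   output differs and no error was raised
    """
    try:
        recovered = bz2.decompress(flip_bit(compressed, bit_index))
    except (OSError, ValueError, EOFError):
        return "detected"
    return "harmless" if recovered == original else "silent"


if __name__ == "__main__":
    # Hypothetical stand-in for a kernel patch; any byte string would do.
    patch_text = b"".join(
        b"+line %d of a hypothetical patch\n" % i for i in range(2000)
    )
    compressed = bz2.compress(patch_text)
    middle_bit = len(compressed) * 8 // 2
    print(classify_damage(patch_text, compressed, middle_bit))
```

Note that this only covers damage to the compressed file itself. Once the data has been decompressed and is sitting in a pipe buffer or in patch's memory, the CRC no longer protects it.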
 
> >Do you have ECC memory?
>
> No. Do you mean it's an error in my RAM? I have never run into such a thing
> before. How often does that kind of thing happen? Maybe less often than a
> bug in a stable kernel?

The best studies I've seen suggest so-called "soft errors" in DRAM
happen at a rate of once a week to once a day per gigabyte of RAM at
sea level. It's unknown how many of these errors manifest by visibly
corrupting data, but it wouldn't be surprising if it were
significantly less than 10%. But ECC is definitely not just for the
paranoid!
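
As a rough back-of-the-envelope version of the numbers above, the following Python sketch turns a per-gigabyte error rate into an expected count and a probability of at least one event. The 1 GB machine, the 30-day window, the use of 10% as an upper bound on the visible fraction, and the Poisson (independent errors) model are all assumptions made for illustration, not figures from any study.

```python
import math


def expected_soft_errors(ram_gb: float, days: float,
                         errors_per_gb_per_day: float) -> float:
    """Expected number of soft errors over the given period."""
    return ram_gb * days * errors_per_gb_per_day


def prob_at_least_one(expected_events: float) -> float:
    """Chance of one or more events, assuming a Poisson process."""
    return 1.0 - math.exp(-expected_events)


if __name__ == "__main__":
    ram_gb = 1.0            # assumed machine size
    days = 30.0             # assumed observation window
    visible_fraction = 0.10  # assumed upper bound on visibly corrupting errors

    # The two ends of the quoted range: once a week and once a day per GB.
    for label, rate in (("once a week", 1.0 / 7.0), ("once a day", 1.0)):
        total = expected_soft_errors(ram_gb, days, rate)
        visible = total * visible_fraction
        print(f"{label}: about {total:.1f} soft errors, "
              f"at most about {visible:.2f} visible, "
              f"P(at least one visible) <= {prob_at_least_one(visible):.2f}")
```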

So if I were to rank the reliability of everything, it'd look
something like this, highest to lowest:

 bzip2: simple, stable and heavily-used codebase, built-in safeguards like CRC
 patch: simple, stable, heavily-used, limited detection of input errors
 CPU: heavily used, very low non-catastrophic failure rate
 disk: heavily used, CRC on cable, ECC on disk
 kernel: complex, rapidly-changing, but heavily-used
 Non-ECC DRAM: significant known transient failure rate
 
When the error rate for the kernel approaches that of DRAM, it gets
very hard to assign blame.

(And of course, there's the user, who tends to be near the bottom of
this range, but I'll let you judge that.)

-- 
Mathematics is the supreme nostalgia of our time.