linux-kernel - Re: 2.6.22-rc4-mm1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20070608044301.GA2822@localhost.localdomain>
Date:	Fri, 8 Jun 2007 12:43:01 +0800
From:	WANG Cong <xiyou.wangcong@...il.com>
To:	Matt Mackall <mpm@...enic.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, Rusty Russell <rusty@...tcorp.com.au>
Subject: Re: 2.6.22-rc4-mm1

On Thu, Jun 07, 2007 at 11:59:16AM -0500, Matt Mackall wrote:
>On Fri, Jun 08, 2007 at 12:39:30AM +0800, WANG Cong wrote:
>> >Ketchup doesn't even look inside patches, and patch doesn't invent
>> >names, so something in the bzip2 -> patch(1) -> filesystem chain got
>> >corrupted. Probably not bzip2, as it has CRCs.
>> >
>> 
>> Do you mean ketchup doesn't do anything if a file is corrupted?
>
>Ketchup never even sees the filenames. It just calls bzip2 | patch. So
>it can't be responsible for damaging the filename.
> 
>> >Do you have ECC memory?
>>
>> No. Do you mean it's an error of my RAM? I have never met such things before,
>> how often does such kind of things happen? May be less often than a bug in
>> a stable kernel?
>
>The best studies I've seen suggest so-called "soft errors" in DRAM
>happen at a rate of once a week to once a day per gigabyte of RAM at
>sea-level. It's unknown how many of these errors manifest by visibly
>corrupting data, but it wouldn't be surprising if it were
>significantly less than 10%. But ECC is definitely not just for the
>paranoid!
>
>So if I were to rank the reliability of everything, it'd look
>something like this, highest to lowest:
>
> bzip: simple, stable and heavily-used codebase, built-in safeguards like CRC
> patch: simple, stable, heavily-used, limited detection of input errors
> CPU: heavily used, very low non-catastrophic failure rate
> disk: heavily used, CRC on cable, ECC on disk
> kernel: complex, rapidly-changing, but heavily-used
> Non-ECC DRAM: significant known transient failure rate
> 
>When the error rate for the kernel approaches that of DRAM, it gets
>very hard to assign blame.
>
>(And of course, there's the user, who tends to be near the bottom of
>this range, but I'll let you judge that.)

Good explanation! Thanks!

That the RAM error occurs so often really surprises me.
I think it might be RAM's fault, because, at least, it's less reproduceable than
a bug in a stable kernel.


Regards!

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/