Message-ID: <20070607165916.GZ11115@waste.org>
Date:	Thu, 7 Jun 2007 11:59:16 -0500
From:	Matt Mackall <mpm@...enic.com>
To:	WANG Cong <xiyou.wangcong@...il.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, Rusty Russell <rusty@...tcorp.com.au>
Subject: Re: 2.6.22-rc4-mm1

On Fri, Jun 08, 2007 at 12:39:30AM +0800, WANG Cong wrote:
> >Ketchup doesn't even look inside patches, and patch doesn't invent
> >names, so something in the bzip2 -> patch(1) -> filesystem chain got
> >corrupted. Probably not bzip2, as it has CRCs.
> >
> 
> Do you mean ketchup doesn't do anything if a file is corrupted?

Ketchup never even sees the filenames. It just calls bzip2 | patch. So
it can't be responsible for damaging the filename.
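
To make the point concrete, here is a rough Python sketch of what a
"bzip2 | patch" pipeline looks like from the caller's side. This is not
ketchup's actual code; the function names, the tree_dir argument and
the -p1 strip level are assumptions for illustration. The caller only
ever handles the path of the compressed patch. The names of the files
being patched travel through the pipe and are parsed by patch(1) alone.

```python
import subprocess


def pipeline_commands(patch_path, strip=1):
    """Build the two argv lists for 'bzip2 -dc PATCH | patch -pN'.

    Hypothetical helper: only the path of the compressed patch appears
    here, never the names of the files the patch will touch.
    """
    decompress = ["bzip2", "-dc", patch_path]
    apply = ["patch", f"-p{strip}"]
    return decompress, apply


def apply_compressed_patch(patch_path, tree_dir, strip=1):
    """Run the pipeline inside tree_dir; return True if both stages exit 0."""
    decompress, apply = pipeline_commands(patch_path, strip)
    bz = subprocess.Popen(decompress, stdout=subprocess.PIPE)
    pt = subprocess.Popen(apply, stdin=bz.stdout, cwd=tree_dir)
    # Close our copy of the pipe so bzip2 gets SIGPIPE if patch exits early.
    bz.stdout.close()
    patch_rc = pt.wait()
    bzip2_rc = bz.wait()
    return bzip2_rc == 0 and patch_rc == 0
```

Checking both exit codes matters here: a CRC failure in bzip2 shows up
as a non-zero exit from the first stage even if patch exits cleanly.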
 
> >Do you have ECC memory?
>
> No. Do you mean it's an error of my RAM? I have never met such things before,
> how often does such kind of things happen? May be less often than a bug in
> a stable kernel?

The best studies I've seen suggest so-called "soft errors" in DRAM
happen at a rate of once a week to once a day per gigabyte of RAM at
sea level. It's unknown what fraction of these errors visibly corrupt
data, though it wouldn't be surprising if it were significantly less
than 10%. Even so, ECC is definitely not just for the paranoid!
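
As a back-of-the-envelope check, the quoted range can be turned into
expected error counts. The sketch below assumes a constant error rate
that scales linearly with memory size and time, and it uses 10% as an
upper bound on the visible fraction. Both are simplifications for
illustration, not measured figures, and the function name is made up.

```python
# Quoted range: one soft error per gigabyte per week (low end)
# to one per gigabyte per day (high end).
ERRORS_PER_GB_PER_DAY_LOW = 1 / 7
ERRORS_PER_GB_PER_DAY_HIGH = 1.0


def expected_soft_errors(gigabytes, days, visible_fraction=1.0):
    """Return (low, high) expected error counts.

    Assumes a constant rate, linear in memory size and time.
    visible_fraction scales the result down to errors that actually
    corrupt data in a noticeable way (an assumed value, not measured).
    """
    scale = gigabytes * days * visible_fraction
    return ERRORS_PER_GB_PER_DAY_LOW * scale, ERRORS_PER_GB_PER_DAY_HIGH * scale


if __name__ == "__main__":
    low, high = expected_soft_errors(gigabytes=1, days=365)
    print(f"1 GB, one year, all soft errors: {low:.0f} to {high:.0f}")

    low, high = expected_soft_errors(gigabytes=1, days=365, visible_fraction=0.10)
    print(f"1 GB, one year, at most 10% visible: {low:.1f} to {high:.1f}")
```

Even with the pessimistic 10% cap on visible errors, a single gigabyte
of non-ECC RAM would see a handful of visible corruptions per year at
the low end of the range.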

So if I were to rank the reliability of everything, it'd look
something like this, highest to lowest:

 bzip2: simple, stable and heavily-used codebase, built-in safeguards like CRC
 patch: simple, stable, heavily-used, limited detection of input errors
 CPU: heavily used, very low non-catastrophic failure rate
 disk: heavily used, CRC on cable, ECC on disk
 kernel: complex, rapidly-changing, but heavily-used
 Non-ECC DRAM: significant known transient failure rate
 
When the error rate for the kernel approaches that of DRAM, it gets
very hard to assign blame.

(And of course, there's the user, who tends to be near the bottom of
this range, but I'll let you judge that.)

-- 
Mathematics is the supreme nostalgia of our time.
