Message-ID: <20090615154832.73c89733@lxorguk.ukuu.org.uk>
Date:	Mon, 15 Jun 2009 15:48:32 +0100
From:	Alan Cox <alan@...rguk.ukuu.org.uk>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Hugh Dickins <hugh.dickins@...cali.co.uk>,
	Wu Fengguang <fengguang.wu@...el.com>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
	Mel Gorman <mel@....ul.ie>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...e.de>,
	Andi Kleen <andi@...stfloor.org>,
	"riel@...hat.com" <riel@...hat.com>,
	"chris.mason@...cle.com" <chris.mason@...cle.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH 00/22] HWPOISON: Intro (v5)

On Mon, 15 Jun 2009 15:29:34 +0200
Andi Kleen <andi@...stfloor.org> wrote:

> 
> I think you're wrong about killing processes decreasing
> reliability. Traditionally we always tried to keep things running if possible
> instead of panicking. That is why ext3 or block does not default to panic
> on each IO error for example. Or oops does not panic by default like
> on BSDs. Your argumentation would be good for a traditional early Unix
> which likes to panic instead of handling errors, but that's not the
> Linux way as I know it.

Everyone I knew in the business end of deploying Linux turned on panics
for I/O errors, reboot on panic and all the rest of those.

Why? Because they don't want a system where the web server is running
but not logging transactions, or to find out the database is up but that
some other "must not fail" layer killed or stalled the backup server for
it last week ...

The I/O ones can really blow up on you in a high-reliability environment
because often the process still exists but isn't working, which fools much
of the monitoring software.

> That said you can configure it anyways to panic if you want,
> but it would be a very bad default.

That depends on for whom.

> See also Linus' or hpa's statement on the topic.

Linus doesn't run big server systems. It's a really bad default for
developers. It's probably a bad default for desktop users.

> We did a lot of testing with these separate test suites and also
> some other tests. For much more it needs actual users pounding on it, and that 
> can be only adequately done in mainline.

That's why we have -next and -mm.

> We did build tests on ia64 and power and it was reviewed by Tony for IA64.
> The ia64 specific code is not quite ready yet, but will come at some point.
> 
> I don't think it's a requirement for merging to have PPC64 support.

Really - so if your design is wrong for the way PPC wants to work, what
are we going to do? It's not a requirement that PPC64 support is there
but it is most certainly a requirement that it's been in -next a while and
other arch maintainers have at least had time to say "works for me",
"irrelevant to my platform" or "Arghhh noooo.. ECC errors work like
[this] so we need ..."

I'd guess that zSeries has some rather different views on how ECC
failures propagate through the hypervisors for example, including the
fact that a failed page can be unfailed which you don't seem to allow for.

(You can unfail pages on x86 as well, it appears, by scrubbing them via DMA
- yes?)


Alan