Message-ID: <20090615064447.GA18390@wotan.suse.de>
Date: Mon, 15 Jun 2009 08:44:47 +0200
From: Nick Piggin <npiggin@...e.de>
To: Wu Fengguang <fengguang.wu@...el.com>
Cc: Balbir Singh <balbir@...ux.vnet.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
Mel Gorman <mel@....ul.ie>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Hugh Dickins <hugh.dickins@...cali.co.uk>,
Andi Kleen <andi@...stfloor.org>,
"riel@...hat.com" <riel@...hat.com>,
"chris.mason@...cle.com" <chris.mason@...cle.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH 00/22] HWPOISON: Intro (v5)
On Mon, Jun 15, 2009 at 12:27:53PM +0800, Wu Fengguang wrote:
> On Mon, Jun 15, 2009 at 11:18:18AM +0800, Balbir Singh wrote:
> > Wu Fengguang wrote:
> > > Hi all,
> > >
> > > Comments are warmly welcome on the newly introduced uevent code :)
> > >
> > > I hope we can reach consensus in this round and then be able to post
> > > a final version for .31 inclusion.
> >
> > Isn't that too aggressive? .31 is already in the merge window.
>
> Yes, a bit aggressive. This is a new feature that involves complex logic.
> However it is basically a no-op when there are no memory errors,
> and when memory corruption does occur, it's better to (possibly) panic
> in this code than to panic unconditionally in the absence of this
> feature (as said by Rik).
>
> So IMHO it's OK for .31 as long as we agree on the user interfaces,
> ie. /proc/sys/vm/memory_failure_early_kill and the hwpoison uevent.
>
> It has come a long way through numerous reviews, and I believe all the
> important issues and concerns have been addressed. Nick, Rik, Hugh,
> Ingo, ... what are your opinions? Is the uevent good enough to meet
> your request to "die hard" or "die gracefully" or whatever on memory
> failure events?

Uevent? As in, send a message to userspace? I don't think this
would be ideal for a fail-stop/failover situation.

I can't see a good reason to rush to merge it.

IMO the userspace-visible changes have maybe not been considered
too thoroughly, which is what I'd be most worried about. I probably
missed seeing documentation of exact semantics and situations
where admins should tune things one way or the other.

Did we verify with filesystem maintainers (eg. btrfs) that the
!ISREG test will be enough to prevent oopses?

I hope it is going to be merged with an easy-to-use fault injector,
because that is the only way Joe kernel developer is ever going to
test it.