Message-ID: <20100518193437.GC30936@elte.hu>
Date: Tue, 18 May 2010 21:34:37 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Borislav Petkov <bp@...64.org>
Cc: "Luck, Tony" <tony.luck@...el.com>, Joe Perches <joe@...ches.com>,
Mauro Carvalho Chehab <mchehab@...hat.com>,
Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"bluesmoke-devel@...ts.sourceforge.net"
<bluesmoke-devel@...ts.sourceforge.net>,
Linux Edac Mailing List <linux-edac@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Ben Woodard <woodard@...hat.com>,
Matt Domsch <Matt_Domsch@...l.com>,
Doug Thompson <dougthompson@...ssion.com>,
"Young, Brent" <brent.young@...el.com>
Subject: Re: Hardware Error Kernel Mini-Summit

* Borislav Petkov <bp@...64.org> wrote:
> Well, we have a trace_mce_record tracepoint in the
> mcheck code which calls all the necessary callbacks when
> an mcheck occurs. For the time being, the idea is to use
> the mce.c ring buffer for early mchecks and copy them to
> the regular ftrace per-cpu buffer after the last has
> been initialized. Later, we could switch to another
> early bootmem buffer if there's a need to.
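
The staging scheme quoted above (hold early mchecks in a small ring,
then copy them into the per-cpu buffers once those exist) could be
sketched roughly as below. This is a plain userspace C illustration,
not the mce.c or ftrace code: struct mce_record, record_mce(),
drain_early() and all sizes are hypothetical names and values chosen
for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* All names and sizes here are hypothetical, for illustration only. */
#define EARLY_SLOTS  8
#define NR_CPUS      2
#define PERCPU_SLOTS 16

struct mce_record {
    int cpu;
    unsigned long long status;
};

/* Small early ring: overwrites the oldest record when it overflows. */
struct early_ring {
    struct mce_record slot[EARLY_SLOTS];
    size_t head;                /* total records written so far */
};

/* Stand-in for a per-cpu trace buffer. */
struct percpu_buf {
    struct mce_record slot[PERCPU_SLOTS];
    size_t count;
};

static struct early_ring early;
static struct percpu_buf percpu[NR_CPUS];
static bool trace_ready;        /* set once the per-cpu buffers are usable */

static bool percpu_store(const struct mce_record *r)
{
    struct percpu_buf *b;

    if (r->cpu < 0 || r->cpu >= NR_CPUS)
        return false;
    b = &percpu[r->cpu];
    if (b->count >= PERCPU_SLOTS)
        return false;
    b->slot[b->count++] = *r;
    return true;
}

/* Record an mcheck: early ring before init, per-cpu buffer afterwards. */
static void record_mce(const struct mce_record *r)
{
    if (trace_ready) {
        percpu_store(r);
        return;
    }
    early.slot[early.head % EARLY_SLOTS] = *r;
    early.head++;
}

/*
 * Call once after the last per-cpu buffer is initialized. Copies the
 * surviving early records, oldest first, and returns how many were copied.
 */
static size_t drain_early(void)
{
    size_t n = early.head < EARLY_SLOTS ? early.head : EARLY_SLOTS;
    size_t start = early.head - n;
    size_t copied = 0;

    for (size_t i = 0; i < n; i++)
        if (percpu_store(&early.slot[(start + i) % EARLY_SLOTS]))
            copied++;

    early.head = 0;
    trace_ready = true;
    return copied;
}
```

Calling record_mce() a few times, then drain_early(), then
record_mce() again shows both paths: the early records end up in the
per-cpu buffers in arrival order, and later records go there directly.
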
The end result would be even simpler by one more step:
with persistent events we just use them and don't need the
mce.c ring buffer at all. (Getting rid of that complication
is one of the code cleanliness benefits I see in this move
as an x86 maintainer, beyond the obvious generalization
and unification benefits.)
> Also, we want to have a userspace daemon that reads out
> the mces from the trace buffer and does further
> processing like thresholding etc in userspace.
>
> Concerning critical errors, there we bypass the perf
> subsystem and execute the smallest amount of code
> possible while trying to shut down gracefully if the
> error type allows that.
Yeah. Each perf_event can have arbitrary callbacks with
add-on (or critical) functionality. We would activate the
event(s) during bootup and it would do its thing from that
point on: critical functionality gets a direct path via
the callback, and every other event that survives goes via
the regular perf output channels, to one (or more)
consumers/subscribers of these events.
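
To make the two paths concrete, here is a rough userspace C sketch of
that dispatch. It is not the perf_event API: struct error_event,
error_event_fire(), the callback signature and the "fatal" flag are
all hypothetical, and the real callback would run in machine check
context with far stricter constraints.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error record; not the kernel's struct mce. */
struct hw_error {
    int cpu;
    unsigned long long status;
    bool fatal;
};

/* Direct path. Returns true if the event survives and should be output. */
typedef bool (*event_callback_t)(const struct hw_error *err);
/* Regular path. One call per subscribed consumer. */
typedef void (*event_consumer_t)(const struct hw_error *err);

#define MAX_CONSUMERS 4

struct error_event {
    event_callback_t callback;  /* may be NULL */
    event_consumer_t consumers[MAX_CONSUMERS];
    size_t nr_consumers;
};

static bool error_event_subscribe(struct error_event *ev, event_consumer_t fn)
{
    if (ev->nr_consumers >= MAX_CONSUMERS)
        return false;
    ev->consumers[ev->nr_consumers++] = fn;
    return true;
}

/* Fire the event; returns the number of consumers that received it. */
static size_t error_event_fire(struct error_event *ev,
                               const struct hw_error *err)
{
    /* Critical functionality gets first look, before any output. */
    if (ev->callback && !ev->callback(err))
        return 0;

    /* Everything that survives goes to all subscribers. */
    for (size_t i = 0; i < ev->nr_consumers; i++)
        ev->consumers[i](err);
    return ev->nr_consumers;
}

/* Example callback: a fatal error requests shutdown and is not output. */
static bool shutdown_requested;

static bool critical_callback(const struct hw_error *err)
{
    if (err->fatal) {
        shutdown_requested = true;
        return false;
    }
    return true;
}

/* Example consumer: counts and prints the events it receives. */
static size_t logged;

static void log_consumer(const struct hw_error *err)
{
    logged++;
    printf("cpu %d: status 0x%llx\n", err->cpu, err->status);
}
```

With critical_callback installed and log_consumer subscribed, a
corrected error reaches the consumer, while a fatal one stops at the
callback and only sets shutdown_requested.
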
Ingo