linux-kernel - Re: Hardware Error Kernel Mini-Summit


Message-ID: <20100518193437.GC30936@elte.hu>
Date:	Tue, 18 May 2010 21:34:37 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Borislav Petkov <bp@...64.org>
Cc:	"Luck, Tony" <tony.luck@...el.com>, Joe Perches <joe@...ches.com>,
	Mauro Carvalho Chehab <mchehab@...hat.com>,
	Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"bluesmoke-devel@...ts.sourceforge.net" 
	<bluesmoke-devel@...ts.sourceforge.net>,
	Linux Edac Mailing List <linux-edac@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	Ben Woodard <woodard@...hat.com>,
	Matt Domsch <Matt_Domsch@...l.com>,
	Doug Thompson <dougthompson@...ssion.com>,
	"Young, Brent" <brent.young@...el.com>
Subject: Re: Hardware Error Kernel Mini-Summit


* Borislav Petkov <bp@...64.org> wrote:

> Well, we have a trace_mce_record tracepoint in the 
> mcheck code which calls all the necessary callbacks when 
> an mcheck occurs. For the time being, the idea is to use 
> the mce.c ring buffer for early mchecks and copy them to 
> the regular ftrace per-cpu buffer after the latter has 
> been initialized. Later, we could switch to another 
> early bootmem buffer if there's a need to.

The end result would be simpler still: with persistent 
events we just use them and don't need the mce.c 
ringbuffer at all. (Getting rid of that complication is 
one of the code cleanliness benefits I see in this move 
as an x86 maintainer - beyond the obvious generalization 
and unification benefits.)
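
For comparison, the interim scheme quoted above (stage early records in a small buffer, then replay them into the per-cpu buffers once those exist) could look roughly like the sketch below. This is a userspace illustration under stated assumptions, not the mce.c or ftrace code: `struct mce_rec`, `log_mce()`, `percpu_buffers_init_done()` and all sizes are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified record; the real struct mce is larger. */
struct mce_rec {
	int cpu;
	unsigned long long status;
};

#define EARLY_SLOTS	4	/* assumed size of the early staging buffer */
#define NR_CPUS_SKETCH	2	/* assumed CPU count for the sketch */
#define PERCPU_SLOTS	8	/* assumed per-cpu buffer depth */

static struct mce_rec early_buf[EARLY_SLOTS];
static size_t early_count;

static struct mce_rec percpu_buf[NR_CPUS_SKETCH][PERCPU_SLOTS];
static size_t percpu_count[NR_CPUS_SKETCH];
static bool percpu_ready;

/* Append a record to the buffer of the CPU it was seen on. */
static bool percpu_push(const struct mce_rec *r)
{
	if (r->cpu < 0 || r->cpu >= NR_CPUS_SKETCH)
		return false;
	if (percpu_count[r->cpu] >= PERCPU_SLOTS)
		return false;
	percpu_buf[r->cpu][percpu_count[r->cpu]++] = *r;
	return true;
}

/* Log a record: staged early, or sent straight to the per-cpu buffer. */
bool log_mce(const struct mce_rec *r)
{
	assert(r != NULL);
	if (percpu_ready)
		return percpu_push(r);
	if (early_count >= EARLY_SLOTS)
		return false;	/* early buffer full: record is dropped */
	early_buf[early_count++] = *r;
	return true;
}

/* Call once the per-cpu buffers exist; replays the staged records. */
size_t percpu_buffers_init_done(void)
{
	size_t moved = 0;

	percpu_ready = true;
	for (size_t i = 0; i < early_count; i++)
		if (percpu_push(&early_buf[i]))
			moved++;
	early_count = 0;
	return moved;
}
```

With persistent events, the staging buffer and the replay step would both disappear, which is the simplification argued for above.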

> Also, we want to have a userspace daemon that reads out 
> the mces from the trace buffer and does further 
> processing like thresholding etc in userspace.
> 
> Concerning critical errors, there we bypass the perf 
> subsystem and execute the smallest amount of code 
> possible while trying to shut down gracefully if the 
> error type allows that.

Yeah. Each perf_event can have arbitrary callbacks with 
add-on (or critical) functionality. We would activate the 
event(s) during bootup and it would do its thing from that 
point on: critical functionality gets a direct path via 
the callback, and every other event that survives goes via 
the regular perf output channels, to one (or more) 
consumers/subscribers of these events.
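
The split described above (a direct callback for critical handling, regular output channels for everything that survives) can be sketched in plain userspace C. This is only an illustration of the dispatch idea: `struct err_event`, `err_event_subscribe()`, `err_event_fire()` and the two handlers are hypothetical names, and none of this is the kernel's `struct perf_event` or its API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hardware error record for the sketch. */
struct hw_err {
	int cpu;
	bool fatal;
};

/* Callback returns true if it handled the error and it must not go on. */
typedef bool (*err_callback)(const struct hw_err *e);
/* Consumers stand in for subscribers of the regular output channel. */
typedef void (*err_consumer)(const struct hw_err *e);

#define MAX_CONSUMERS 4

struct err_event {
	err_callback callback;			/* direct (critical) path */
	err_consumer consumers[MAX_CONSUMERS];	/* regular output path */
	size_t nr_consumers;
};

bool err_event_subscribe(struct err_event *ev, err_consumer c)
{
	assert(ev != NULL && c != NULL);
	if (ev->nr_consumers >= MAX_CONSUMERS)
		return false;
	ev->consumers[ev->nr_consumers++] = c;
	return true;
}

/* Critical path first; only surviving events reach the consumers. */
void err_event_fire(struct err_event *ev, const struct hw_err *e)
{
	if (ev->callback && ev->callback(e))
		return;
	for (size_t i = 0; i < ev->nr_consumers; i++)
		ev->consumers[i](e);
}

static int nr_shutdowns;
static int nr_logged;

/* Example critical callback: claims fatal errors, lets the rest pass. */
static bool critical_cb(const struct hw_err *e)
{
	if (!e->fatal)
		return false;
	nr_shutdowns++;		/* stand-in for a minimal shutdown path */
	return true;
}

/* Example consumer: stand-in for a daemon reading the event stream. */
static void log_consumer(const struct hw_err *e)
{
	(void)e;
	nr_logged++;
}
```

An event set up once at boot with `critical_cb` as its callback and `log_consumer` subscribed would send a fatal error down the short path only, while a corrected error would reach every subscriber.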

	Ingo