Message-ID: <20110504065843.GC20828@elte.hu>
Date: Wed, 4 May 2011 08:58:43 +0200
From: Ingo Molnar <mingo@...e.hu>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: Borislav Petkov <bp@...64.org>,
Peter Zijlstra <peterz@...radead.org>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Mauro Carvalho Chehab <mchehab@...hat.com>,
EDAC devel <linux-edac@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
"Petkov, Borislav" <Borislav.Petkov@....com>
Subject: Re: [PATCH 4/4] x86, mce: Have MCE persistent event off by default for now
* Luck, Tony <tony.luck@...el.com> wrote:
> > Ok, the problem I see with it is that people without a RAS daemon
> > running will have the mechanism collecting MCEs in the background, using
> > up resources (4 pages per CPU is the buffer) and not doing anything (in
> > the best case that is, when we're not broken otherwise).
>
> Can the kernel detect whether anyone is listening to the
> persistent MCE event? If so, then the kernel could printk()
> something to let a user with no RAS daemon (or a dead
> daemon) know that stuff is happening that they might like to
> know about.
>
> It would probably make sense to delay such a message (so that in
> the boot case we give the daemon a chance to get started before
> complaining that it hasn't shown up for work).
Yes, I definitely think a gateway to printk would be useful, so that the system
can log MCE events the syslog way as well. This probably makes sense for
persistent events in general, not just MCE events.
Btw., as a side note, the much more interesting direction is the reverse
one: we want a gateway from printk into the RAS daemon as well, in the form of
special 'printk events' that contain:
- the log level of the kernel when the message was generated
- the log level of the message
- the printk timestamp
- plus the printk message itself, as a free-form string
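To make the four fields above concrete, here is a rough userspace sketch of what
such a record could look like. This is only an illustration: struct printk_event,
its field names, the 256-byte message limit and both helpers are hypothetical and
do not exist in the kernel. The only thing borrowed from printk is the usual rule
that a message reaches the console when its level is numerically lower than the
console log level.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical size limit for the free-form message text. */
#define PRINTK_EVENT_MSG_MAX 256

/* Hypothetical record layout carrying the four fields listed above. */
struct printk_event {
	int      console_loglevel;          /* kernel log level when generated */
	int      msg_loglevel;              /* log level of the message itself */
	uint64_t ts_nsec;                   /* printk timestamp, nanoseconds */
	char     msg[PRINTK_EVENT_MSG_MAX]; /* free-form message string */
};

/* Fill in one record; an over-long message is truncated, not rejected. */
static void printk_event_fill(struct printk_event *ev, int console_loglevel,
			      int msg_loglevel, uint64_t ts_nsec,
			      const char *msg)
{
	ev->console_loglevel = console_loglevel;
	ev->msg_loglevel = msg_loglevel;
	ev->ts_nsec = ts_nsec;
	snprintf(ev->msg, sizeof(ev->msg), "%s", msg);
}

/* Non-zero if the message would have been shown on the console. */
static int printk_event_was_visible(const struct printk_event *ev)
{
	return ev->msg_loglevel < ev->console_loglevel;
}
```

Carrying both levels lets a consumer tell whether the message was visible on the
console at the time or only landed in the log buffer.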
This would allow RAS functionality to dispatch off printk events immediately
and transparently, without having to separately worry about how to talk to
syslogd/klogd or how to get at its logs ...
printk itself could become a persistent event. (Transparently, and without
breaking compatibility with existing syslogd/klogd functionality.)
This would also allow the RAS daemon to log printk messages around suspicious
MCE events, in a time-serialized way via a single event channel - so post
mortem can be done using a single facility.
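As a rough sketch of what that buys a consumer: with both kinds of records
arriving on one time-ordered channel, picking out the printk messages near an MCE
is a simple scan. Everything below is hypothetical (struct ras_record, the source
enum, the helper name and the nanosecond window); it is a userspace illustration
only, not an existing interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag for where a record on the single channel came from. */
enum ras_source { RAS_SRC_MCE, RAS_SRC_PRINTK };

/* Hypothetical minimal record: a timestamp plus its source. */
struct ras_record {
	uint64_t        ts_nsec;
	enum ras_source src;
};

/*
 * Count the printk records whose timestamp lies within window_nsec of
 * the MCE timestamp mce_ts, on either side of it.
 */
static size_t ras_count_printk_near(const struct ras_record *log, size_t n,
				    uint64_t mce_ts, uint64_t window_nsec)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		uint64_t ts = log[i].ts_nsec;
		/* Absolute distance without unsigned underflow. */
		uint64_t dist = ts > mce_ts ? ts - mce_ts : mce_ts - ts;

		if (log[i].src == RAS_SRC_PRINTK && dist <= window_nsec)
			count++;
	}
	return count;
}
```

A real daemon would presumably copy the matching records out rather than count
them; the count just keeps the sketch short.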
There's ongoing work to timestamp perf events with GTOD timestamps - that way
global log analysis becomes possible as well.
Thanks,
Ingo