Message-ID: <20100127163651.GF12522@basil.fritz.box>
Date: Wed, 27 Jan 2010 17:36:52 +0100
From: Andi Kleen <andi@...stfloor.org>
To: Mauro Carvalho Chehab <mchehab@...hat.com>
Cc: Andi Kleen <andi@...stfloor.org>, Ingo Molnar <mingo@...e.hu>,
Borislav Petkov <petkovbb@...glemail.com>, mingo@...hat.com,
hpa@...or.com, linux-kernel@...r.kernel.org, tglx@...utronix.de,
Andreas Herrmann <andreas.herrmann3@....com>,
Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>,
linux-tip-commits@...r.kernel.org,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Frédéric Weisbecker <fweisbec@...il.com>,
Aristeu Rozanski <aris@...hat.com>,
Doug Thompson <norsk5@...oo.com>,
Huang Ying <ying.huang@...el.com>,
Arjan van de Ven <arjan@...radead.org>
Subject: Re: [tip:x86/mce] x86, mce: Rename cpu_specific_poll to
mce_cpu_specific_poll
On Wed, Jan 27, 2010 at 01:04:55PM -0200, Mauro Carvalho Chehab wrote:
> I haven't the datasheets for 75xx, so I can't say for sure if it would be better to
> use the same driver or to fork it.
You can't use the same driver.
> Well, the error parsing can be done in kernel space in a standard way provided
> by the edac interface.
>
> I don't see why not the mcelog userspace shouldn't use the EDAC interface as one
> of its source, getting memory errors from it, avoiding the need of re-parsing
> the errors.
The errors are just numbers which are printed. If by "parsing" you mean
splitting up the bitfields, that's not really a very interesting case.
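To illustrate what "splitting up the bitfields" amounts to, here is a minimal sketch in Python. The bit positions are the architectural IA32_MCi_STATUS ones; the function name and the returned dictionary keys are made up for this example and are not taken from mcelog or EDAC.

```python
# Architectural flag bits of the IA32_MCi_STATUS machine check register.
MCI_STATUS_VAL = 1 << 63    # register contains a valid error
MCI_STATUS_OVER = 1 << 62   # an earlier error was overwritten
MCI_STATUS_UC = 1 << 61     # error was not corrected
MCI_STATUS_EN = 1 << 60     # reporting was enabled
MCI_STATUS_MISCV = 1 << 59  # MCi_MISC holds extra information
MCI_STATUS_ADDRV = 1 << 58  # MCi_ADDR holds an address
MCI_STATUS_PCC = 1 << 57    # processor context corrupt


def split_mci_status(status):
    """Split a raw 64-bit status value into named fields (hypothetical helper)."""
    return {
        "valid": bool(status & MCI_STATUS_VAL),
        "overflow": bool(status & MCI_STATUS_OVER),
        "uncorrected": bool(status & MCI_STATUS_UC),
        "enabled": bool(status & MCI_STATUS_EN),
        "misc_valid": bool(status & MCI_STATUS_MISCV),
        "addr_valid": bool(status & MCI_STATUS_ADDRV),
        "context_corrupt": bool(status & MCI_STATUS_PCC),
        # Bits 31:16 are the model-specific error code, bits 15:0 the MCA code.
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": status & 0xFFFF,
    }
```

This is only shifting and masking; it can be done equally well on either side of the kernel boundary.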
To get the terminology clear: for corrected errors there are multiple
steps (uncorrected errors are quite different):
1) Getting the error from hardware registers
2) Accounting them
3) Presenting them to users
4) Reacting to events
where (4) can be separated into
4a) protocol to communicate with event handler
  4a1) interface to wake up event handler
  4a2) communication
4b) event handler itself
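The split between 4a1) and 4a2) can be sketched with a pipe standing in for the kernel-to-user channel. This is an illustration only, not the actual MCE interface: the 16-byte record layout and the function names are assumptions made for the example, and it relies on select() working on pipes, so it is Unix-only.

```python
import os
import select
import struct

# Hypothetical fixed-size record: (status, physical address), both 64-bit.
RECORD = struct.Struct("<QQ")


def post_event(write_fd, status, addr):
    """Producer side: queue one record (writes this small are atomic on a pipe)."""
    os.write(write_fd, RECORD.pack(status, addr))


def wait_for_events(read_fd, timeout):
    """4a1: block until the descriptor is readable; 4a2: read and unpack records."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return []  # timed out, nothing to handle
    data = os.read(read_fd, RECORD.size * 16)
    return [
        RECORD.unpack_from(data, offset)
        for offset in range(0, len(data) - RECORD.size + 1, RECORD.size)
    ]
```

The event handler (4b) would sit in a loop around wait_for_events() and apply whatever policy it wants to each record.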
Some parts of these need to be in kernel space, but that's
pretty much only (1).
Some parts of these need to be in user space: in particular
4b) and (3) for any non-trivial presentation (the kernel can
do a very limited one, but it's not good at anything non-trivial
here).
4b needs to be in user space: it involves deep policy, and most interesting
advanced reactions to errors cannot be done in kernel space alone.
i7core does (1), some of (2) (but not completely), and 4a).
I don't really count EDAC as (3) because fishing the numbers out of
sysfs by hand is not user friendly. In EDAC that's typically done
with the EDAC utils, which are user space.
EDAC doesn't really solve 4a1) unless you count "write a syslog scanner"
as a solution.
The xeon75xx mce driver only does (1) and uses the standard
MCE event passing mechanism (4a) to pass it to mcelog.
mcelog just does the other parts, most of which have to be in user space
anyways.
The only thing you could probably argue about is whether it should do
accounting or not. Right now it does it and EDAC does it too. At least
for advanced accounting (per page), where you can have a lot of
data (it can be larger than a struct page per 4K page),
I personally prefer that to be swappable.
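As a rough sketch of what per-page accounting in user space could look like (the class name, the threshold policy, and the return convention are assumptions for this example, not mcelog's actual code):

```python
from collections import Counter

PAGE_SHIFT = 12  # 4K pages


class PageErrorAccounting:
    """Count corrected errors per physical page in ordinary process memory."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = Counter()  # page frame number -> corrected error count

    def record(self, phys_addr):
        """Account one error; return True when the page first hits the threshold."""
        pfn = phys_addr >> PAGE_SHIFT
        self.counts[pfn] += 1
        return self.counts[pfn] == self.threshold
```

Since the table lives in a normal process, it is ordinary swappable memory, and it only grows with the number of pages that have actually seen errors. The event handler can react (for example by asking for the page to be taken out of use) the first time record() returns True.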
Hope this helps,
-Andi
--
ak@...ux.intel.com -- Speaking for myself only.