Message-ID: <3908561D78D1C84285E8C5FCA982C28F31D41E37@ORSMSX106.amr.corp.intel.com>
Date:	Fri, 18 Oct 2013 20:57:22 +0000
From:	"Luck, Tony" <tony.luck@...el.com>
To:	Borislav Petkov <bp@...en8.de>,
	"Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com>
CC:	"Chen, Gong" <gong.chen@...ux.intel.com>,
	"joe@...ches.com" <joe@...ches.com>,
	"m.chehab@...sung.com" <m.chehab@...sung.com>,
	"arozansk@...hat.com" <arozansk@...hat.com>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v3 4/9] ACPI, x86: Extended error log driver for x86
 platform

> Hmm, that's a good question you raise: but the more important question
> is, do you guys - Gong and Tony - want to replace the logging we're
> already doing, i.e. mce_log() with extlog or not.

Long term ... I'd be happy to see mce_log() go away.  But we need to have
a robust, well tested replacement in place for some time before such a
move is up for discussion.

> Because if you want to replace the current logging you actually have to
> exit machine_check_poll() after having done mce_ext_err_print() so that
> the rest of the chain doesn't see the error.

Yes - double error reporting should be avoided.

> And, does mce_ext_err_print only report DRAM ECC errors or other error
> types too?

Our first platforms to implement this only do so for memory errors.  This
could change in the future (the UEFI appendix N error record has defined
sub-sections for lots of types of errors).

Currently EDAC, hooked into the mce event notification chain, provides a
return code to indicate whether it completely processed the error, or
whether to fall through to the rest of mce_log():

	if (ret == NOTIFY_STOP)
		return;
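
To make the control flow concrete, here is a rough userspace model of that
pattern. It is only a sketch and not kernel code: the chain, the event
struct, the counters and every function name below (chain_register,
chain_call, extlog_notifier, model_mce_log) are hypothetical stand-ins, and
the return codes are simplified to two values. It assumes an ext_log style
handler that claims memory errors only and lets everything else fall
through to the legacy logging path.

```c
/* Userspace model only, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stddef.h>

/* Simplified return codes, loosely modeled on the kernel's notifier idea. */
enum { NOTIFY_DONE = 0, NOTIFY_STOP = 1 };

/* Hypothetical, stripped-down error record. */
struct mce_event {
	unsigned long long status;
	int is_memory_error;
};

typedef int (*notifier_fn)(struct mce_event *ev);

#define MAX_NOTIFIERS 8
static notifier_fn chain[MAX_NOTIFIERS];
static size_t chain_len;

/* Counters standing in for "a record was emitted by this path". */
static int legacy_log_count;
static int extlog_count;

/* Forget all registered handlers and zero the counters. */
void chain_reset(void)
{
	chain_len = 0;
	legacy_log_count = 0;
	extlog_count = 0;
}

/* Append a handler to the chain (no priorities in this model). */
void chain_register(notifier_fn fn)
{
	assert(chain_len < MAX_NOTIFIERS);
	chain[chain_len++] = fn;
}

/* Call handlers in registration order; stop at the first NOTIFY_STOP. */
int chain_call(struct mce_event *ev)
{
	size_t i;

	for (i = 0; i < chain_len; i++) {
		if (chain[i](ev) == NOTIFY_STOP)
			return NOTIFY_STOP;
	}
	return NOTIFY_DONE;
}

/* Assumed ext_log behaviour: fully handle memory errors, pass on the rest. */
int extlog_notifier(struct mce_event *ev)
{
	if (!ev->is_memory_error)
		return NOTIFY_DONE;
	extlog_count++;
	return NOTIFY_STOP;
}

/* Model of the logging entry point: skip legacy logging if a handler
 * on the chain reported that it completely processed the event. */
void model_mce_log(struct mce_event *ev)
{
	int ret = chain_call(ev);

	if (ret == NOTIFY_STOP)
		return;
	legacy_log_count++;
}
```

With extlog_notifier registered, a memory error handed to model_mce_log()
bumps only extlog_count, while any other error bumps only legacy_log_count,
so each event is reported once.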

Having EDAC and this new extended error log both registered on this
chain would probably not be helpful in most cases.  Not sure if we should
handle that with user education (don't load both an EDAC and an ext_log
driver) or if there should be some enforcement.

> Btw, if we keep both, then we're going to have two tracepoints -
> trace_mce_record() in mce_log() and this one - issuing each a record for
> the same event. Which is not really what we want I'd say...

trace_mce_record() dumps the raw data from the machine check banks.
I think there may still be a case for having this.  Analysis tools that look at
this trace as well should be smart enough to connect the dots.

-Tony
