Date:   Mon, 28 Aug 2017 15:45:49 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     linux-edac <linux-edac@...r.kernel.org>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        Tony Luck <tony.luck@...el.com>,
        Yazen Ghannam <Yazen.Ghannam@....com>, X86 ML <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Mauro Carvalho Chehab <mchehab@...radead.org>
Subject: Re: [PATCH 0/7] EDAC, mce_amd: Issue decoded MCE through the
 tracepoint

On Fri, Aug 25, 2017 at 12:24:04PM +0200, Borislav Petkov wrote:
> Next step is adding that to rasdaemon.

Ok, below is the dirty version of the changes that need to go into
rasdaemon, I'll clean that up later. With it, I get:

# ./rasdaemon -f
overriding event (541) ras:mc_event with new print handler
rasdaemon: ras:mc_event event enabled
rasdaemon: Enabled event ras:mc_event
overriding event (58) mce:mce_record with new print handler
rasdaemon: mce:mce_record event enabled
rasdaemon: Enabled event mce:mce_record
overriding event (542) ras:extlog_mem_event with new print handler
rasdaemon: ras:extlog_mem_event event enabled
rasdaemon: Enabled event ras:extlog_mem_event
rasdaemon: Listening to events for cpus 0 to 7
cpu 07:           <...>-104   [1433913776]     0.000006: mce_record:           2017-08-28 17:41:01 +0200 bank=4, status= 9c7d410092080813, MC4 Error (node 2): DRAM ECC error detected on the NB.
, cpu_type= generic CPU, cpu= 2, socketid= 0, misc= 0, addr= 6d3d483b, , apicid= 0

and looking at it now, I don't need that "MC%d Error..:" thing either.

All queued for the next version.

---
diff --git a/ras-mce-handler.c b/ras-mce-handler.c
index 2e520d3663ac..ff6f4b373e56 100644
--- a/ras-mce-handler.c
+++ b/ras-mce-handler.c
@@ -23,6 +23,7 @@
 #include <unistd.h>
 #include <stdint.h>
 #include "libtrace/kbuffer.h"
+#include "libtrace/event-utils.h"
 #include "ras-mce-handler.h"
 #include "ras-record.h"
 #include "ras-logger.h"
@@ -185,6 +186,10 @@ static int detect_cpu(struct ras_events *ras)
 	ret = 0;
 
 	if (!strcmp(mce->vendor, "AuthenticAMD")) {
+
+		ret = 0;
+		goto ret;
+
 		if (mce->family == 15)
 			mce->cputype = CPU_K8;
 		if (mce->family > 15) {
@@ -357,8 +362,9 @@ int ras_mce_event_handler(struct trace_seq *s,
 	unsigned long long val;
 	struct ras_events *ras = context;
 	struct mce_priv *mce = ras->mce_priv;
+	const char *decoded_mce;
 	struct mce_event e;
-	int rc = 0;
+	int rc = 0, len;
 
 	memset(&e, 0, sizeof(e));
 
@@ -422,6 +428,10 @@ int ras_mce_event_handler(struct trace_seq *s,
 	if (rc)
 		return rc;
 
+	decoded_mce = pevent_get_field_raw(s, event, "decoded_str", record, &len, 1);
+	if (decoded_mce)
+		strncpy(e.error_msg, decoded_mce, min(len, 4096));
+
 	if (!*e.error_msg && *e.mcastatus_msg)
 		mce_snprintf(e.error_msg, "%s", e.mcastatus_msg);
 
-- 
Regards/Gruss,
    Boris.
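
A note on the strncpy() call in the hunk above: strncpy() does not write a
terminating NUL when the source fills the whole bound. Assuming e.error_msg
is a 4096-byte array (which the min(len, 4096) bound suggests but the patch
does not show), a raw field of 4096 bytes or more would leave it
unterminated. Below is a minimal sketch of a bounded copy that always
terminates. copy_raw_field() is a hypothetical helper, not something that
exists in rasdaemon or libtrace; it only assumes that the raw field arrives
as a pointer plus a length, as pevent_get_field_raw() returns it in the
patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of rasdaemon): copy a raw trace field of
 * src_len bytes into dst, which holds dst_size bytes. The result is always
 * NUL-terminated and truncated if needed. Returns the number of bytes
 * copied, not counting the terminator.
 */
static size_t copy_raw_field(char *dst, size_t dst_size,
			     const void *src, int src_len)
{
	const char *nul;
	size_t n;

	if (!dst || dst_size == 0)
		return 0;

	if (!src || src_len <= 0) {
		dst[0] = '\0';
		return 0;
	}

	/* Leave room for the terminator. */
	n = (size_t)src_len;
	if (n > dst_size - 1)
		n = dst_size - 1;

	/* Stop early if the field already contains a NUL. */
	nul = memchr(src, '\0', n);
	if (nul)
		n = (size_t)(nul - (const char *)src);

	memcpy(dst, src, n);
	dst[n] = '\0';
	return n;
}
```

In the handler this would be used roughly as
copy_raw_field(e.error_msg, sizeof(e.error_msg), decoded_mce, len),
again assuming error_msg is an array and not a pointer.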
