Message-ID: <20170828134549.wh6toneoca47ff2w@pd.tnic>
Date: Mon, 28 Aug 2017 15:45:49 +0200
From: Borislav Petkov <bp@...en8.de>
To: linux-edac <linux-edac@...r.kernel.org>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Tony Luck <tony.luck@...el.com>,
Yazen Ghannam <Yazen.Ghannam@....com>, X86 ML <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Mauro Carvalho Chehab <mchehab@...radead.org>
Subject: Re: [PATCH 0/7] EDAC, mce_amd: Issue decoded MCE through the
tracepoint
On Fri, Aug 25, 2017 at 12:24:04PM +0200, Borislav Petkov wrote:
> Next step is adding that to rasdaemon.
Ok, below is the dirty version of the changes that need to go into
rasdaemon; I'll clean that up later. With it, I get:
# ./rasdaemon -f
overriding event (541) ras:mc_event with new print handler
rasdaemon: ras:mc_event event enabled
rasdaemon: Enabled event ras:mc_event
overriding event (58) mce:mce_record with new print handler
rasdaemon: mce:mce_record event enabled
rasdaemon: Enabled event mce:mce_record
overriding event (542) ras:extlog_mem_event with new print handler
rasdaemon: ras:extlog_mem_event event enabled
rasdaemon: Enabled event ras:extlog_mem_event
rasdaemon: Listening to events for cpus 0 to 7
cpu 07: <...>-104 [1433913776] 0.000006: mce_record: 2017-08-28 17:41:01 +0200 bank=4, status= 9c7d410092080813, MC4 Error (node 2): DRAM ECC error detected on the NB.
, cpu_type= generic CPU, cpu= 2, socketid= 0, misc= 0, addr= 6d3d483b, , apicid= 0
Looking at it now, I don't need that "MC%d Error..:" prefix either.
All queued for the next version.
---
diff --git a/ras-mce-handler.c b/ras-mce-handler.c
index 2e520d3663ac..ff6f4b373e56 100644
--- a/ras-mce-handler.c
+++ b/ras-mce-handler.c
@@ -23,6 +23,7 @@
#include <unistd.h>
#include <stdint.h>
#include "libtrace/kbuffer.h"
+#include "libtrace/event-utils.h"
#include "ras-mce-handler.h"
#include "ras-record.h"
#include "ras-logger.h"
@@ -185,6 +186,10 @@ static int detect_cpu(struct ras_events *ras)
ret = 0;
if (!strcmp(mce->vendor, "AuthenticAMD")) {
+
+ ret = 0;
+ goto ret;
+
if (mce->family == 15)
mce->cputype = CPU_K8;
if (mce->family > 15) {
@@ -357,8 +362,9 @@ int ras_mce_event_handler(struct trace_seq *s,
unsigned long long val;
struct ras_events *ras = context;
struct mce_priv *mce = ras->mce_priv;
+ const char *decoded_mce;
struct mce_event e;
- int rc = 0;
+ int rc = 0, len;
memset(&e, 0, sizeof(e));
@@ -422,6 +428,10 @@ int ras_mce_event_handler(struct trace_seq *s,
if (rc)
return rc;
+ decoded_mce = pevent_get_field_raw(s, event, "decoded_str", record, &len, 1);
+ if (decoded_mce)
+ strncpy(e.error_msg, decoded_mce, min(len, 4096));
+
if (!*e.error_msg && *e.mcastatus_msg)
mce_snprintf(e.error_msg, "%s", e.mcastatus_msg);
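A note on the copy in the last hunk: strncpy() does not add a terminating NUL when the source fills the whole count, so the hunk relies on the earlier memset() of e and on the copied length staying below the size of e.error_msg, which the diff does not show. For the cleaned-up version, a bounded copy that always terminates could look roughly like the sketch below. copy_raw_field() is a made-up name for illustration, not an existing rasdaemon or libtrace function, and the buffer size is whatever the caller passes in.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of rasdaemon): copy at most `len` bytes of a
 * raw trace field into `dst`, truncating to fit, and always NUL-terminate.
 * Returns the number of bytes copied, not counting the terminator.
 */
static size_t copy_raw_field(char *dst, size_t dst_size,
			     const char *src, int len)
{
	const char *nul;
	size_t n;

	if (!dst || dst_size == 0)
		return 0;

	if (!src || len <= 0) {
		dst[0] = '\0';
		return 0;
	}

	n = (size_t)len;
	if (n > dst_size - 1)
		n = dst_size - 1;	/* leave room for the terminator */

	/* Stop at an embedded NUL so no stale bytes follow the string. */
	nul = memchr(src, '\0', n);
	if (nul)
		n = (size_t)(nul - src);

	memcpy(dst, src, n);
	dst[n] = '\0';

	return n;
}
```

In the handler this would be called as copy_raw_field(e.error_msg, sizeof(e.error_msg), decoded_mce, len), assuming e.error_msg is a fixed-size char array so that sizeof gives its capacity.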
--
Regards/Gruss,
Boris.