Message-Id: <1327764771-28649-17-git-send-email-mchehab@redhat.com>
Date: Sat, 28 Jan 2012 13:32:51 -0200
From: Mauro Carvalho Chehab <mchehab@...hat.com>
To: unlisted-recipients:; (no To-header on input)
Cc: Mauro Carvalho Chehab <mchehab@...hat.com>,
Linux Edac Mailing List <linux-edac@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: [PATCH RFCv2 16/16] edac: Add an error scope logic
This patch is currently incomplete. The idea is to change the EDAC
error calls to take a scope variable, which will be used when
providing the error traces to userspace and to increment a
per-location counter.
Signed-off-by: Mauro Carvalho Chehab <mchehab@...hat.com>
---
include/linux/edac.h | 27 +++++++++++++++++++++++++++
1 files changed, 27 insertions(+), 0 deletions(-)
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 5876675..879116e 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -72,6 +72,33 @@ enum hw_event_mc_err_type {
HW_EVENT_ERR_FATAL,
};
+/**
+ * enum hw_event_error_scope - scope of a memory error
+ * @HW_EVENT_SCOPE_MC: error can be anywhere inside the MC
+ * @HW_EVENT_SCOPE_MC_BRANCH: error can be on any DIMM inside the branch
+ * @HW_EVENT_SCOPE_MC_CHANNEL: error can be on any DIMM inside the MC channel
+ * @HW_EVENT_SCOPE_MC_CSROW: error can be on any DIMM inside the csrow
+ * @HW_EVENT_SCOPE_MC_CSROW_CHANNEL: error is on a specific DIMM
+ *
+ * Depending on the error detection algorithm, the memory topology and even
+ * the MC capabilities, some errors can't be attributed to just one DIMM, but
+ * only to a group of memory sockets. Depending on where the error occurs, the
+ * EDAC core will increment the corresponding error count for that entity
+ * and for the entities above it. For example, assuming a system with 1 memory
+ * controller, 2 branches, 2 MC channels and 4 DIMMs, if an error
+ * happens at channel 0, the error counts for channel 0, for branch 0 and
+ * for memory controller 0 will be incremented. The DIMM error counts won't
+ * be incremented because, in this example, the driver can't be sure which
+ * DIMM the error actually occurred on.
+ */
+enum hw_event_error_scope {
+ HW_EVENT_SCOPE_MC,
+ HW_EVENT_SCOPE_MC_BRANCH,
+ HW_EVENT_SCOPE_MC_CHANNEL,
+ HW_EVENT_SCOPE_MC_CSROW,
+ HW_EVENT_SCOPE_MC_CSROW_CHANNEL,
+};
+
/* memory types */
enum mem_type {
MEM_EMPTY = 0, /* Empty csrow */
--
1.7.8
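
The counting rule described in the kernel-doc comment of this patch can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not the EDAC core implementation: every demo_* name is
hypothetical, the layout (2 branches, 1 channel per branch, 2 DIMMs per
channel) is one possible reading of the example in the comment, and the csrow
level is left out because the patch does not say how it nests with branches
and channels.

```c
#include <assert.h>

/* Hypothetical layout: 2 branches, 1 channel per branch, 2 DIMMs per channel */
#define DEMO_NR_BRANCHES 2
#define DEMO_NR_CHANNELS 2
#define DEMO_NR_DIMMS    4

/* Simplified stand-in for enum hw_event_error_scope (csrow level omitted) */
enum demo_scope {
	DEMO_SCOPE_MC,      /* error can be anywhere inside the MC */
	DEMO_SCOPE_BRANCH,  /* error can be on any DIMM inside the branch */
	DEMO_SCOPE_CHANNEL, /* error can be on any DIMM inside the channel */
	DEMO_SCOPE_DIMM,    /* error is on a specific DIMM */
};

/* Indices finer than the reported scope are ignored and may be -1 */
struct demo_location {
	int branch;
	int channel;
	int dimm;
};

/* One counter per entity at each level of the hierarchy */
struct demo_counters {
	unsigned int mc;
	unsigned int branch[DEMO_NR_BRANCHES];
	unsigned int channel[DEMO_NR_CHANNELS];
	unsigned int dimm[DEMO_NR_DIMMS];
};

/*
 * Increment the counter of the entity named by the scope and of every
 * entity above it. Counters below the scope are left untouched, since
 * the error cannot be attributed to any one of them.
 */
void demo_count_error(struct demo_counters *c, enum demo_scope scope,
		      const struct demo_location *loc)
{
	c->mc++;

	if (scope >= DEMO_SCOPE_BRANCH) {
		assert(loc->branch >= 0 && loc->branch < DEMO_NR_BRANCHES);
		c->branch[loc->branch]++;
	}
	if (scope >= DEMO_SCOPE_CHANNEL) {
		assert(loc->channel >= 0 && loc->channel < DEMO_NR_CHANNELS);
		c->channel[loc->channel]++;
	}
	if (scope >= DEMO_SCOPE_DIMM) {
		assert(loc->dimm >= 0 && loc->dimm < DEMO_NR_DIMMS);
		c->dimm[loc->dimm]++;
	}
}
```

With a zero-initialized struct demo_counters, reporting an error with
DEMO_SCOPE_CHANNEL at branch 0, channel 0 increments the MC, branch 0 and
channel 0 counters and leaves all DIMM counters at zero, which matches the
example given in the comment.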