[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241017133839.000035dd@Huawei.com>
Date: Thu, 17 Oct 2024 13:38:39 +0100
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: <shiju.jose@...wei.com>
CC: <dave.jiang@...el.com>, <dan.j.williams@...el.com>,
<alison.schofield@...el.com>, <vishal.l.verma@...el.com>,
<ira.weiny@...el.com>, <dave@...olabs.net>, <linux-cxl@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linuxarm@...wei.com>,
<tanxiaofei@...wei.com>, <prime.zeng@...ilicon.com>
Subject: Re: [RFC PATCH 3/4] cxl/events: Updates for CXL DRAM Event Record
On Wed, 16 Oct 2024 17:33:48 +0100
<shiju.jose@...wei.com> wrote:
> From: Shiju Jose <shiju.jose@...wei.com>
>
> CXL spec 3.1 section 8.2.9.2.1.2 Table 8-46, DRAM Event Record has updated
> with following new fields and new types for Memory Event Type, Transaction
> Type and Validity Flags fields.
> 1. Component Identifier
> 2. Sub-channel
> 3. Advanced Programmable Corrected Memory Error Threshold Event Flags
> 4. Corrected Memory Error Count at Event
> 5. Memory Event Sub-Type
>
> Add updates for the above spec changes in the CXL events record and CXL
> DRAM trace event implementations.
>
> Signed-off-by: Shiju Jose <shiju.jose@...wei.com>
Passing comments on two things inline.
1) There are a couple of whitespace consistency changes in here.
Spaces to tabs for alignment. That's fine but maybe needs a brief
mention in the patch description.
2) Really odd that the spec didn't have a component ID field for DRAM
errors. They weren't all that useful before the PLDM format was added
but still a curiosity that made me open up the 3.0 spec. Indeed, no
such field.
With that one line added to the patch description this looks good to me.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> ---
> drivers/cxl/core/trace.h | 44 ++++++++++++++++++++++++++++++++--------
> include/cxl/event.h | 7 ++++++-
> 2 files changed, 42 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e638e82429bc..20790dffa2b4 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -468,7 +468,7 @@ TRACE_EVENT(cxl_general_media,
> /*
> * DRAM Event Record - DER
> *
> - * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> + * CXL rev 3.1 section 8.2.9.2.1.2; Table 8-46
> */
> /*
> * DRAM Event Record defines many fields the same as the General Media Event
> @@ -478,11 +478,17 @@ TRACE_EVENT(cxl_general_media,
> #define CXL_DER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR 0x01
> #define CXL_DER_MEM_EVT_TYPE_INV_ADDR 0x02
> #define CXL_DER_MEM_EVT_TYPE_DATA_PATH_ERROR 0x03
> -#define show_dram_mem_event_type(type) __print_symbolic(type, \
> +#define CXL_DER_MEM_EVT_TYPE_TE_STATE_VIOLATION 0x04
> +#define CXL_DER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE 0x05
> +#define CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION 0x06
> +#define show_dram_mem_event_type(type) __print_symbolic(type, \
This change looks odd here but does print the line above into the
same formatting style as the other similar cases in the file.
Maybe worth a line in the patch description to say "Includes trivial consistency of white
space improvements" just to flag up that it was intentional.
> { CXL_DER_MEM_EVT_TYPE_ECC_ERROR, "ECC Error" }, \
> { CXL_DER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR, "Scrub Media ECC Error" }, \
> { CXL_DER_MEM_EVT_TYPE_INV_ADDR, "Invalid Address" }, \
> - { CXL_DER_MEM_EVT_TYPE_DATA_PATH_ERROR, "Data Path Error" } \
> + { CXL_DER_MEM_EVT_TYPE_DATA_PATH_ERROR, "Data Path Error" }, \
> + { CXL_DER_MEM_EVT_TYPE_TE_STATE_VIOLATION, "TE State Violation" }, \
> + { CXL_DER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE, "Adv Prog CME Counter Expiration" }, \
> + { CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" } \
> )
>
> #define CXL_DER_VALID_CHANNEL BIT(0)
> @@ -493,7 +499,10 @@ TRACE_EVENT(cxl_general_media,
> #define CXL_DER_VALID_ROW BIT(5)
> #define CXL_DER_VALID_COLUMN BIT(6)
> #define CXL_DER_VALID_CORRECTION_MASK BIT(7)
> -#define show_dram_valid_flags(flags) __print_flags(flags, "|", \
> +#define CXL_DER_VALID_COMPONENT BIT(8)
> +#define CXL_DER_VALID_COMPONENT_ID_FORMAT BIT(9)
> +#define CXL_DER_VALID_SUB_CHANNEL BIT(10)
> +#define show_dram_valid_flags(flags) __print_flags(flags, "|", \
As above this is a minor white space consistency change.
> { CXL_DER_VALID_CHANNEL, "CHANNEL" }, \
> { CXL_DER_VALID_RANK, "RANK" }, \
> { CXL_DER_VALID_NIBBLE, "NIBBLE" }, \
> @@ -501,7 +510,9 @@ TRACE_EVENT(cxl_general_media,
> { CXL_DER_VALID_BANK, "BANK" }, \
> { CXL_DER_VALID_ROW, "ROW" }, \
> { CXL_DER_VALID_COLUMN, "COLUMN" }, \
> - { CXL_DER_VALID_CORRECTION_MASK, "CORRECTION MASK" } \
> + { CXL_DER_VALID_CORRECTION_MASK, "CORRECTION MASK" }, \
> + { CXL_DER_VALID_COMPONENT, "COMPONENT" }, \
> + { CXL_DER_VALID_SUB_CHANNEL, "SUB CHANNEL" } \
> )
> diff --git a/include/cxl/event.h b/include/cxl/event.h
> index ea8cd44a52e9..7e98492c85df 100644
> --- a/include/cxl/event.h
> +++ b/include/cxl/event.h
> @@ -71,7 +71,12 @@ struct cxl_event_dram {
> u8 row[3];
> u8 column[2];
> u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE];
> - u8 reserved[0x17];
> + u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
Odd that the general media had this field in 3.0 but DRAM didn't.
I checked though and indeed the case!
> + u8 sub_channel;
> + u8 cme_threshold_ev_flags;
> + u8 cvme_count[3];
> + u8 sub_type;
> + u8 reserved;
> } __packed;
>
> /*
Powered by blists - more mailing lists