Message-ID: <0fbde50ced8a478aaa4aabd04cb7cb8a@huawei.com>
Date: Fri, 22 Nov 2024 10:41:14 +0000
From: Shiju Jose <shiju.jose@...wei.com>
To: Jonathan Cameron <jonathan.cameron@...wei.com>
CC: "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>, "mchehab@...nel.org"
<mchehab@...nel.org>, "dave.jiang@...el.com" <dave.jiang@...el.com>,
"dan.j.williams@...el.com" <dan.j.williams@...el.com>,
"alison.schofield@...el.com" <alison.schofield@...el.com>,
"nifan.cxl@...il.com" <nifan.cxl@...il.com>, "vishal.l.verma@...el.com"
<vishal.l.verma@...el.com>, "ira.weiny@...el.com" <ira.weiny@...el.com>,
"dave@...olabs.net" <dave@...olabs.net>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, Linuxarm <linuxarm@...wei.com>, tanxiaofei
<tanxiaofei@...wei.com>, "Zengtao (B)" <prime.zeng@...ilicon.com>
Subject: RE: [PATCH 13/13] rasdaemon: ras-mc-ctl: Update logging of CXL memory
module data to align with CXL spec rev 3.1

Hi Jonathan,

>-----Original Message-----
>From: Jonathan Cameron <jonathan.cameron@...wei.com>
>Sent: 21 November 2024 15:39
>To: Shiju Jose <shiju.jose@...wei.com>
>Cc: linux-edac@...r.kernel.org; linux-cxl@...r.kernel.org;
>mchehab@...nel.org; dave.jiang@...el.com; dan.j.williams@...el.com;
>alison.schofield@...el.com; nifan.cxl@...il.com; vishal.l.verma@...el.com;
>ira.weiny@...el.com; dave@...olabs.net; linux-kernel@...r.kernel.org;
>Linuxarm <linuxarm@...wei.com>; tanxiaofei <tanxiaofei@...wei.com>;
>Zengtao (B) <prime.zeng@...ilicon.com>
>Subject: Re: [PATCH 13/13] rasdaemon: ras-mc-ctl: Update logging of CXL
>memory module data to align with CXL spec rev 3.1
>
>On Wed, 20 Nov 2024 09:59:23 +0000
><shiju.jose@...wei.com> wrote:
>
>> From: Shiju Jose <shiju.jose@...wei.com>
>>
>> CXL spec 3.1 section 8.2.9.2.1.3 Table 8-47, Memory Module Event
>> Record has updated with following new fields and new info for Device
>> Event Type and Device Health Information fields.
>> 1. Validity Flags
>> 2. Component Identifier
>> 3. Device Event Sub-Type
>>
>> This update modifies ras-mc-ctl to parse and log CXL memory module
>> event data stored in the RAS SQLite database table, reflecting the
>> specification changes introduced in revision 3.1.
>>
>> Example output,
>>
>> ./util/ras-mc-ctl --errors
>> ...
>> CXL memory module events:
>> 1 2024-11-20 00:22:33 +0000 error: memdev=mem0, host=0000:0f:00.0,
>> serial=0x3, \ log=Fatal,
>> hdr_uuid=fe927475-dd59-4339-a586-79bab113b774, hdr_flags=0x1, , \
>> hdr_handle=0x1, hdr_related_handle=0x0, hdr_timestamp=1970-01-01
>> 00:04:38 +0000, \ hdr_length=128, hdr_maint_op_class=0,
>> hdr_maint_op_sub_class=1, \
>> event_type: Temperature Change, event_sub_type: Unsupported Config
>> Data, \
>> health_status: 'MAINTENANCE_NEEDED' , 'REPLACEMENT_NEEDED' , \
>> media_status: All Data Loss in Event of Power Loss, life_used=8, \
>> dirty_shutdown_cnt=33, cor_vol_err_cnt=25, cor_per_err_cnt=45, \
>> device_temp=3, add_status=3 \
>> component_id:02 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \
>> pldm_entity_id:00 00 00 00 00 00 pldm_resource_id:fc d2 7e 2f ...
>>
>> Signed-off-by: Shiju Jose <shiju.jose@...wei.com>
>Feels like there is a lot of duplication in here, but you aren't really making it any
>worse and maybe it is hard to reduce it.
>
ras-mc-ctl is a script, used offline, to read, decode and print the error event data
that rasdaemon stores in the SQLite database. The logging here therefore mirrors the
logging done in rasdaemon, hence the duplication.
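
As a rough illustration of that read, decode and print flow, here is a minimal
Python sketch. It is not the ras-mc-ctl code (which is a Perl script). The table
name demo_cxl_memory_module_event, its columns, the EVENT_TYPES lookup and the
VALID_COMPONENT_ID bit are all hypothetical, chosen only so the example is
self-contained; the real schema and decode tables live in rasdaemon.

```python
import sqlite3

# Illustrative lookup table. The codes are assumptions made for this sketch,
# not values copied from rasdaemon or the CXL specification.
EVENT_TYPES = {0: "Health Status Change", 3: "Temperature Change"}

# Hypothetical validity-flag bit: set when component_id holds valid data.
VALID_COMPONENT_ID = 0x1

# Hypothetical table name, used only by this sketch.
TABLE = "demo_cxl_memory_module_event"


def decode_event_type(code):
    """Map a numeric event type to a name, falling back to the raw code."""
    return EVENT_TYPES.get(code, f"Unknown ({code:#x})")


def format_event(row):
    """Turn one database row into a single log line."""
    row_id, timestamp, memdev, event_type, validity_flags, component_id = row
    fields = [
        f"memdev={memdev}",
        f"event_type: {decode_event_type(event_type)}",
    ]
    # Only log the optional field when its validity bit is set.
    if validity_flags & VALID_COMPONENT_ID:
        hex_bytes = " ".join(f"{b:02x}" for b in component_id)
        fields.append(f"component_id:{hex_bytes}")
    return f"{row_id} {timestamp} error: " + ", ".join(fields)


def collect_events(conn):
    """Read every stored event and return the decoded log lines."""
    query = (
        "SELECT id, timestamp, memdev, event_type, validity_flags, "
        f"component_id FROM {TABLE} ORDER BY id"
    )
    return [format_event(row) for row in conn.execute(query)]


def make_demo_db():
    """Create an in-memory database holding one made-up event."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE {TABLE} (id INTEGER PRIMARY KEY, timestamp TEXT, "
        "memdev TEXT, event_type INTEGER, validity_flags INTEGER, "
        "component_id BLOB)"
    )
    conn.execute(
        f"INSERT INTO {TABLE} VALUES (?, ?, ?, ?, ?, ?)",
        (1, "2024-11-20 00:22:33 +0000", "mem0", 3, 0x1,
         bytes([0x02, 0x74, 0xC5, 0x08])),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for line in collect_events(make_demo_db()):
        print(line)
```

The decode tables and the per-field formatting have to exist on the reading side
as well as on the recording side, which is why the two look so similar.
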
>Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>

Thanks,
Shiju