Message-Id: <20240422-cxl-cper3-v2-0-5cdd378fcd0b@intel.com>
Date: Mon, 22 Apr 2024 15:25:44 -0700
From: Ira Weiny <ira.weiny@...el.com>
To: Dave Jiang <dave.jiang@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>,
Shiju Jose <shiju.jose@...wei.com>
Cc: Dan Carpenter <dan.carpenter@...aro.org>,
Yazen Ghannam <yazen.ghannam@....com>, Davidlohr Bueso <dave@...olabs.net>,
Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>, Ard Biesheuvel <ardb@...nel.org>,
linux-efi@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-cxl@...r.kernel.org, Ira Weiny <ira.weiny@...el.com>,
"Rafael J. Wysocki" <rafael@...nel.org>, Tony Luck <tony.luck@...el.com>,
Borislav Petkov <bp@...en8.de>
Subject: [PATCH v2 0/3] efi/cxl-cper: Report CXL CPER events through tracing
If a device is configured for firmware-first handling, CXL event records
are not sent directly to the host; instead, they are reported through
EFI Common Platform Error Records (CPER). EFI 2.10 Section N.2.14
defines the CXL CPER record, whose payload is mostly a CXL event record.
The CXL sub-system uniquely has the information needed to translate a
device physical address (DPA) to a host physical address (HPA).[0] It
also already has event decoding/tracing. Such translations are very
useful for users trying to determine which system issues correspond to
specific hardware events.
The restructuring of the event data structures in 6.8 made sharing the
data between CPER/event logs more efficient. Now re-wire the sending of
CPER records to the CXL sub-system.
In addition, provide a default RAS event for the case where the CXL
module is not loaded.
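
For readers new to the design, the dispatch described above can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not the code in the patches: every name below (the demo_*
functions, struct demo_cper_record and its fields) is hypothetical, and
it assumes the default event is emitted only when no CXL handler has
been registered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a CXL CPER record; not the kernel layout. */
struct demo_cper_record {
	unsigned int event_type;
	unsigned long long dpa;
};

typedef void (*demo_cper_handler)(const struct demo_cper_record *rec);

/* NULL until the (simulated) CXL module registers itself. */
static demo_cper_handler demo_handler;

/* Counters so the two paths can be observed from a caller. */
int demo_default_count;
int demo_cxl_count;

/* Register the CXL-side handler; fail if one is already present. */
int demo_register_handler(demo_cper_handler fn)
{
	if (fn == NULL || demo_handler != NULL)
		return -1;
	demo_handler = fn;
	return 0;
}

/* Simulates the CXL module being unloaded. */
void demo_unregister_handler(void)
{
	demo_handler = NULL;
}

/* Fallback path: report the raw record without CXL-specific decoding. */
static void demo_default_trace(const struct demo_cper_record *rec)
{
	demo_default_count++;
	printf("default event: type=%u dpa=0x%llx\n",
	       rec->event_type, rec->dpa);
}

/* CXL path: where DPA to HPA translation and decoding would happen. */
void demo_cxl_trace(const struct demo_cper_record *rec)
{
	demo_cxl_count++;
	printf("cxl event: type=%u dpa=0x%llx\n",
	       rec->event_type, rec->dpa);
}

/* Entry point: hand the record to the CXL handler if one exists. */
void demo_report(const struct demo_cper_record *rec)
{
	assert(rec != NULL);
	if (demo_handler != NULL)
		demo_handler(rec);
	else
		demo_default_trace(rec);
}
```

Calling demo_report() before demo_register_handler(demo_cxl_trace) takes
the default path; calling it afterwards takes the CXL path, which is the
only place the translation information would be available.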
Series status/background
========================
Smita and Jonathan have been a great help with this series. Once again,
thank you.
Unfortunately, with all the churn surrounding the bug which Dan
Carpenter found, the maintainers were forced to revert this work.
Testing
=======
A quick hack was added in a debugfs patch to facilitate easier
testing.[1] With this it was verified that the bug Dan Carpenter found
is fixed.
However, the tp_printk bug Jonathan found remains. Fortunately,
tp_printk is not widely used, so this is not expected to be an issue.
[0]
Link: https://lore.kernel.org/all/cover.1711598777.git.alison.schofield@intel.com/
[1]
Link: https://github.com/weiny2/linux-kernel/commit/9b1f33314e8488506dbad63dc1c896386d4803d6
Signed-off-by: Ira Weiny <ira.weiny@...el.com>
---
Changes in v2:
- iweiny: address comments from V1 (noted in the patches themselves)
- iweiny: drop header file clean up patch (only needed for my debugfs test)
- Link to v1: https://lore.kernel.org/r/20240228-cxl-cper3-v1-0-6aa3f1343c6c@intel.com
---
Ira Weiny (3):
      acpi/ghes: Process CXL Component Events
      cxl/pci: Process CPER events
      ras/events: Trace CXL CPER events without CXL stack

 drivers/acpi/apei/ghes.c  | 128 ++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/pci.c         |  61 +++++++++++++++++++++-
 include/linux/cxl-event.h |  18 +++++++
 include/ras/ras_event.h   |  51 ++++++++++++++++++
 4 files changed, 257 insertions(+), 1 deletion(-)
---
base-commit: 4d2008430ce87061c9cefd4f83daf2d5bb323a96
change-id: 20240220-cxl-cper3-30e55279f936
Best regards,
--
Ira Weiny <ira.weiny@...el.com>