Message-Id: <20230601-cxl-cper-v3-0-0189d61f7956@intel.com>
Date: Wed, 01 Nov 2023 14:11:17 -0700
From: Ira Weiny <ira.weiny@...el.com>
To: Dan Williams <dan.j.williams@...el.com>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>
Cc: Yazen Ghannam <yazen.ghannam@....com>,
Davidlohr Bueso <dave@...olabs.net>,
Dave Jiang <dave.jiang@...el.com>,
Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Ard Biesheuvel <ardb@...nel.org>, linux-efi@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-cxl@...r.kernel.org,
Ira Weiny <ira.weiny@...el.com>
Subject: [PATCH RFC v3 0/6] efi/cxl-cper: Report CPER CXL component events
through trace events
Series status/background
========================
This is another RFC version of processing the CXL CPER records through
the CXL trace mechanisms as Dan mentioned in [1].
This raises the CXL event structures to a core header and rearranges them
such that they can be shared most efficiently, which eliminates a
memcpy Smita noticed. Also, BDF is used instead of the serial number.
NOTE: I'm still fuzzy on which fields in the CPER record are the correct
ones to use to find the BDF in the Linux code. I would appreciate someone
double checking those.
The CPER code remains compile-tested only. The original event code
continues to pass cxl-test.
[1] https://lore.kernel.org/all/6528808cef2ba_780ef294c5@dwillia2-xfh.jf.intel.com.notmuch/
Cover letter
============
CXL Component Events, as defined by UEFI 2.10 Section N.2.14, wrap a
payload that is mostly a CXL event in an EFI Common Platform Error
Record (CPER). If a device is configured for firmware-first, CXL event
records are not sent directly to the host.
The CXL sub-system uniquely has DPA to HPA translation information. It
also already properly decodes the event format. Send the CXL CPER
records to the CXL sub-system for processing.
With CXL event logs, the device interrupts the host with events. In the
EFI case, events are wrapped with device information which needs to be
matched with the memdev devices the CXL driver is tracking.
A number of alternatives were considered to match the memdev with the
CPER record. The most robust was to find the PCI device via Bus,
Device, Function and match it to the memdev driver data.
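To illustrate the matching described above, here is a minimal userspace
sketch, not the code in this series. All names (struct bdf_id,
struct memdev_entry, find_memdev_by_bdf(), bdf_devfn()) are hypothetical,
and the table stands in for whatever the driver really tracks. In the
kernel the lookup would more likely go through an existing PCI helper
such as pci_get_domain_bus_and_slot(); the sketch only shows the idea of
comparing segment, bus, device and function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device identifier as it might be taken from a CPER record. */
struct bdf_id {
	uint16_t segment;
	uint8_t bus;
	uint8_t device;   /* 0..31 */
	uint8_t function; /* 0..7 */
};

/* Hypothetical stand-in for a memdev the driver is tracking. */
struct memdev_entry {
	struct bdf_id id;
	const char *name;
};

/* Pack device and function into one byte, the same way PCI_DEVFN() does. */
static inline uint8_t bdf_devfn(uint8_t device, uint8_t function)
{
	return (uint8_t)(((device & 0x1f) << 3) | (function & 0x07));
}

/* Return the tracked entry whose BDF matches, or NULL if none does. */
static const struct memdev_entry *
find_memdev_by_bdf(const struct memdev_entry *table, size_t count,
		   const struct bdf_id *want)
{
	assert(table != NULL && want != NULL);

	for (size_t i = 0; i < count; i++) {
		const struct bdf_id *have = &table[i].id;

		if (have->segment == want->segment &&
		    have->bus == want->bus &&
		    bdf_devfn(have->device, have->function) ==
		    bdf_devfn(want->device, want->function))
			return &table[i];
	}
	return NULL;
}
```

A record whose BDF matches no tracked entry yields NULL, so a caller can
drop events for devices the driver does not own.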
CPER records are identified with GUIDs while CXL event logs contain
UUIDs. The UUID was previously printed for all events, but it is
redundant information which adds unnecessary complexity when
processing CPER data. Remove the UUIDs from known events. Restructure
the code so that the data can be shared between the CPER and event log
paths as efficiently as possible.
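As a rough illustration of why dropping the UUID from the known event
structures helps sharing, consider the sketch below. It is not the layout
used in this series: every name (sketch_event, sketch_raw_record,
sketch_event_dpa(), ...) and every field is hypothetical. The assumption
is that an event log entry carries a UUID in front of the payload while a
CPER record supplies the payload alone, so a consumer that takes a
pointer to the UUID-less union can serve both paths without a copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical known-event payloads, defined without a leading UUID. */
struct sketch_gen_media {
	uint64_t dpa;
	uint8_t descriptor;
};

struct sketch_dram {
	uint64_t dpa;
	uint8_t rank;
};

/* One union covers every known payload. */
union sketch_event {
	struct sketch_gen_media gen_media;
	struct sketch_dram dram;
};

/* Event log form: UUID first, then the shared payload. */
struct sketch_raw_record {
	uint8_t uuid[16];
	union sketch_event event;
};

enum sketch_event_type {
	SKETCH_GEN_MEDIA,
	SKETCH_DRAM,
};

/* Single consumer for both paths; it never sees the UUID. */
static uint64_t sketch_event_dpa(enum sketch_event_type type,
				 const union sketch_event *event)
{
	assert(event != NULL);

	switch (type) {
	case SKETCH_GEN_MEDIA:
		return event->gen_media.dpa;
	case SKETCH_DRAM:
		return event->dram.dpa;
	}
	return 0;
}

/* Event log path: hand out a pointer into the record, no memcpy. */
static const union sketch_event *
sketch_event_from_log(const struct sketch_raw_record *rec)
{
	return &rec->event;
}
```

The CPER path would pass its payload pointer to sketch_event_dpa()
directly, while the event log path goes through sketch_event_from_log()
first.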
Signed-off-by: Ira Weiny <ira.weiny@...el.com>
---
Changes in RFC v3:
- djbw: Share structures between CPER/event logs
- Smita: use BDF to resolve the memdev
- djbw/Smita: various cleanups
- Link to v2: https://lore.kernel.org/r/20230601-cxl-cper-v2-0-314d9c36ab02@intel.com
---
Ira Weiny (6):
      cxl/trace: Remove uuid from event trace known events
      cxl/events: Promote CXL event structures to a core header
      cxl/events: Remove UUID from non-generic event structures
      cxl/events: Create a CXL event union
      firmware/efi: Process CXL Component Events
      cxl/memdev: Register for and process CPER events

 drivers/cxl/core/mbox.c         |  57 +++++++++-----
 drivers/cxl/core/trace.h        |  18 ++---
 drivers/cxl/cxlmem.h            |  96 ++---------------------
 drivers/cxl/pci.c               |  59 +++++++++++++-
 drivers/firmware/efi/cper.c     |  15 ++++
 drivers/firmware/efi/cper_cxl.c |  40 ++++++++++
 drivers/firmware/efi/cper_cxl.h |  29 +++++++
 include/linux/cxl-event.h       | 160 ++++++++++++++++++++++++++++++++++++++
 tools/testing/cxl/test/mem.c    | 166 +++++++++++++++++++++++-----------------
 9 files changed, 451 insertions(+), 189 deletions(-)
---
base-commit: 1c8b86a3799f7e5be903c3f49fcdaee29fd385b5
change-id: 20230601-cxl-cper-26ffc839c6c6
Best regards,
--
Ira Weiny <ira.weiny@...el.com>