Message-ID: <CAJZ5v0g80j4iFMXYDKek8VBYsa0g35avvw+UK6RxutcmxSX+WA@mail.gmail.com>
Date: Wed, 14 Jan 2026 17:11:45 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: "Fabio M. De Francesco" <fabio.m.de.francesco@...ux.intel.com>
Cc: linux-cxl@...r.kernel.org, Rafael J Wysocki <rafael@...nel.org>,
Len Brown <lenb@...nel.org>, Tony Luck <tony.luck@...el.com>, Borislav Petkov <bp@...en8.de>,
Hanjun Guo <guohanjun@...wei.com>, Mauro Carvalho Chehab <mchehab@...nel.org>,
Shuai Xue <xueshuai@...ux.alibaba.com>, Davidlohr Bueso <dave@...olabs.net>,
Jonathan Cameron <jonathan.cameron@...wei.com>, Dave Jiang <dave.jiang@...el.com>,
Alison Schofield <alison.schofield@...el.com>, Vishal Verma <vishal.l.verma@...el.com>,
Ira Weiny <ira.weiny@...el.com>, Dan Williams <dan.j.williams@...el.com>,
Mahesh J Salgaonkar <mahesh@...ux.ibm.com>, "Oliver O'Halloran" <oohall@...il.com>,
Bjorn Helgaas <bhelgaas@...gle.com>, linux-kernel@...r.kernel.org,
linux-acpi@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-pci@...r.kernel.org
Subject: Re: [PATCH 0/5 v9] Make ELOG and GHES log and trace consistently
On Wed, Jan 14, 2026 at 11:15 AM Fabio M. De Francesco
<fabio.m.de.francesco@...ux.intel.com> wrote:
>
> When Firmware First is enabled, BIOS handles errors first and then it
> makes them available to the kernel via the Common Platform Error Record
> (CPER) sections (UEFI 2.10 Appendix N). Linux parses the CPER sections
> via one of two similar paths, either ELOG or GHES.
>
> Currently, ELOG and GHES show some inconsistencies in how they print to
> the kernel log as well as in how they report to userspace via trace
> events.
>
> Make the two paths behave consistently with respect to logging and
> tracing.
>
> --- Changes for v9 ---
>
> - #include linux/printk.h for pr_*_ratelimited() in ghes_helpers.c
> Reported-by: kernel test robot <lkp@...el.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202512240711.Iv57ik8I-lkp@intel.com/
>
> --- Changes for v8 ---
>
> - Don't make GHES dependent on PCI and drop patch 3/6;
> incidentally, this resolves the issues that the KTR found with v7
> (Jonathan, Hanjun)
> - Don't make EXTLOG dependent on CXL_BUS and move the new helpers
> to a new file, then link it to ghes.c only if ACPI_APEI_PCIEAER is
> selected. Placing the new helpers in their own translation unit seems
> to be a more flexible and safer solution than messing with Kconfig or
> with conditional compilation macros within ghes.c. PCI may not be an
> option on embedded platforms
>
> --- Changes for v7 ---
>
> - Reference UEFI v2.11 (Sathyanarayanan)
> - Substitute !(A || B) with !(A && B) in an 'if' statement to
> convey the intended logic (Jonathan)
> - Make ACPI_APEI_GHES explicitly select PCIAER because the needed
> ACPI_APEI_PCIEAER doesn't recursively select that prerequisite (Jonathan)
> Reported-by: kernel test robot <lkp@...el.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202510232204.7aYBpl7h-lkp@intel.com/
> Closes: https://lore.kernel.org/oe-kbuild-all/202510232204.XIXgPWD7-lkp@intel.com/
> - Don't add the unnecessary cxl_cper_ras_handle_prot_err() wrapper
> for cxl_cper_handle_prot_err() (Jonathan)
> - Make ACPI_EXTLOG explicitly select PCIAER && ACPI_APEI because
> the needed ACPI_APEI_PCIEAER doesn't recursively select the
> prerequisites
> - Make ACPI_EXTLOG select CXL_BUS
>
> --- Changes for v6 ---
>
> - Rename the helper that copies the CPER CXL protocol error
> information to work struct (Dave)
> - Return -EOPNOTSUPP (instead of -EINVAL) from the two helpers if
> ACPI_APEI_PCIEAER is not defined (Dave)
>
> --- Changes for v5 ---
>
> - Add 3/6 to select ACPI_APEI_PCIEAER for GHES
> - Add 4,5/6 to move common code between ELOG and GHES out to new
> helpers and use them in 6/6 (Jonathan).
>
> --- Changes for v4 ---
>
> - Re-base on top of recent changes of the AER error logging and
> drop obsoleted 2/4 (Sathyanarayanan)
> - Log with pr_warn_ratelimited() (Dave)
> - Collect tags
>
> --- Changes for v3 ---
>
> 1/4, 2/4:
> - collect tags; no functional changes
> 3/4:
> - Invert logic of checks (Yazen)
> - Select CONFIG_ACPI_APEI_PCIEAER (Yazen)
> 4/4:
> - Check serial number only for CXL devices (Yazen)
> - Replace "invalid" with "unknown" in the output of a pr_err()
> (Yazen)
>
> --- Changes for v2 ---
>
> - Add a patch to pass log levels to pci_print_aer() (Dan)
> - Add a patch to trace CPER CXL Protocol Errors
> - Rework commit messages (Dan)
> - Use log_non_standard_event() (Bjorn)
>
> --- Changes for v1 ---
>
> - Drop the RFC prefix and restart from PATCH v1
> - Drop patch 3/3 because a discussion on it has not yet been
> settled
> - Drop namespacing in export of pci_print_aer while() (Dan)
> - Don't use '#ifdef' in *.c files (Dan)
> - Drop a reference on pdev after operation is complete (Dan)
> - Don't log an error message if pdev is NULL (Dan)
>
> Fabio M. De Francesco (5):
> ACPI: extlog: Trace CPER Non-standard Section Body
> ACPI: extlog: Trace CPER PCI Express Error Section
> acpi/ghes: Add helper for CPER CXL protocol errors checks
> acpi/ghes: Add helper to copy CPER CXL protocol error info to work
> struct
> ACPI: extlog: Trace CPER CXL Protocol Error Section
>
> drivers/acpi/Kconfig | 2 +
> drivers/acpi/acpi_extlog.c | 64 +++++++++++++++++++++++++++++++
> drivers/acpi/apei/Makefile | 1 +
> drivers/acpi/apei/ghes.c | 40 +------------------
> drivers/acpi/apei/ghes_helpers.c | 66 ++++++++++++++++++++++++++++++++
> drivers/cxl/core/ras.c | 3 +-
> drivers/pci/pcie/aer.c | 2 +-
> include/cxl/event.h | 22 +++++++++++
> 8 files changed, 160 insertions(+), 40 deletions(-)
> create mode 100644 drivers/acpi/apei/ghes_helpers.c
>
>
> base-commit: b71e635feefc8
> --
Applied as 6.20 material, thanks!
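
For readers following the changelog, two of the quoted items (return
-EOPNOTSUPP from the helpers when ACPI_APEI_PCIEAER is not defined, and
use !(A && B) rather than !(A || B) in a check) can be illustrated with
a small userspace sketch. This is not the code from the series: every
name below (DEMO_HAVE_PCIEAER, demo_check_valid_bits, demo_check_stub,
demo_check) is hypothetical, and the "both flags must be set" rule is an
assumption chosen only to show the difference between the two conditions.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a Kconfig option; not a real kernel symbol. */
#ifndef DEMO_HAVE_PCIEAER
#define DEMO_HAVE_PCIEAER 0
#endif

/*
 * Assumed rule for this sketch: the record is usable only if BOTH flags
 * are set. !(a && b) rejects the record when either flag is missing,
 * whereas !(a || b) would reject it only when both are missing.
 */
static inline int demo_check_valid_bits(bool a, bool b)
{
	if (!(a && b))
		return -EINVAL;
	return 0;
}

/* Stub used when the feature is compiled out: report "not supported". */
static inline int demo_check_stub(bool a, bool b)
{
	(void)a;
	(void)b;
	return -EOPNOTSUPP;
}

/*
 * The selection happens once, in header-like code, so callers in .c
 * files need no #ifdef of their own.
 */
#if DEMO_HAVE_PCIEAER
#define demo_check(a, b) demo_check_valid_bits(a, b)
#else
#define demo_check(a, b) demo_check_stub(a, b)
#endif
```

With DEMO_HAVE_PCIEAER left at 0, demo_check() always yields
-EOPNOTSUPP, which lets a caller tell "feature not built" apart from
"record invalid" (-EINVAL).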