lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 2 Jan 2024 12:27:34 -0800
From: Ira Weiny <ira.weiny@...el.com>
To: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>,
	<linux-efi@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<linux-cxl@...r.kernel.org>
CC: Ard Biesheuvel <ardb@...nel.org>, Alison Schofield
	<alison.schofield@...el.com>, Vishal Verma <vishal.l.verma@...el.com>, "Ira
 Weiny" <ira.weiny@...el.com>, Dan Williams <dan.j.williams@...el.com>,
	Jonathan Cameron <Jonathan.Cameron@...wei.com>, Yazen Ghannam
	<yazen.ghannam@....com>, Smita Koralahalli
	<Smita.KoralahalliChannabasappa@....com>
Subject: Re: [PATCH 4/4] acpi/ghes, cxl/pci: Trace FW-First CXL Protocol
 Errors

Smita Koralahalli wrote:
> When PCIe AER is in FW-First, OS should process CXL Protocol errors from
> CPER records. These CPER records obtained from GHES module, will rely on
> a registered callback to be notified to the CXL subsystem in order to be
> processed.
> 
> Call the existing cxl_cper_callback to notify the CXL subsystem on a
> Protocol error.
> 
> The defined trace events cxl_aer_uncorrectable_error and
> cxl_aer_correctable_error currently trace native CXL AER errors. Reuse
> them to trace FW-First Protocol Errors.
> 
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>

[snip]

>  int cxl_cper_register_callback(cxl_cper_callback callback)
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 37e1652afbc7..da516982a625 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -6,6 +6,7 @@
>  #include <linux/pci.h>
>  #include <linux/pci-doe.h>
>  #include <linux/aer.h>
> +#include <linux/cper.h>
>  #include <cxlpci.h>
>  #include <cxlmem.h>
>  #include <cxl.h>
> @@ -836,6 +837,51 @@ void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_setup_parent_dport, CXL);
>  
> +#define CXL_AER_UNCORRECTABLE	0
> +#define CXL_AER_CORRECTABLE	1

Better defined as an enum?

> +
> +int cper_severity_cxl_aer(int cper_severity)

My gut says that it would be better to hide this conversion in the
GHES/CPER code and send a more generic defined CXL_AER_* severity through.

> +{
> +	switch (cper_severity) {
> +	case CPER_SEV_RECOVERABLE:
> +	case CPER_SEV_FATAL:
> +		return CXL_AER_UNCORRECTABLE;
> +	default:
> +		return CXL_AER_CORRECTABLE;
> +	}
> +}
> +
> +void cxl_prot_err_trace_record(struct cxl_dev_state *cxlds,
> +			       struct cxl_cper_rec_data *data)
> +{
> +	struct cper_cxl_event_sn *dev_serial_num =  &data->rec.hdr.dev_serial_num;
> +	u32 status, fe;
> +	int severity;
> +
> +	severity = cper_severity_cxl_aer(data->severity);
> +
> +	cxlds->serial = (((u64)dev_serial_num->upper_dw << 32) |
> +			dev_serial_num->lower_dw);

This permanently overwrites the serial number read from PCI...

If the serial number does not match up or was not valid (per the check in
the previous patch) lets add a warning.

AFAICT they should match.

Ira

[snip]

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ