lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240620141514.00007c6d@Huawei.com>
Date: Thu, 20 Jun 2024 14:15:14 +0100
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: Terry Bowman <terry.bowman@....com>
CC: <dan.j.williams@...el.com>, <ira.weiny@...el.com>, <dave@...olabs.net>,
	<dave.jiang@...el.com>, <alison.schofield@...el.com>, <ming4.li@...el.com>,
	<vishal.l.verma@...el.com>, <jim.harris@...sung.com>,
	<ilpo.jarvinen@...ux.intel.com>, <ardb@...nel.org>,
	<sathyanarayanan.kuppuswamy@...ux.intel.com>, <linux-cxl@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <Yazen.Ghannam@....com>,
	<Robert.Richter@....com>
Subject: Re: [RFC PATCH 9/9] cxl/pci: Enable interrupts for CXL PCIe ports'
 AER internal errors

On Mon, 17 Jun 2024 15:04:11 -0500
Terry Bowman <terry.bowman@....com> wrote:

> CXL RAS errors are reported through AER interrupts using the AER status:
> correctbale internal errors (CIE) and AER uncorrectable internal errors

correctable

> (UIE).[1] But, the AER CIE/UIE are disabled by default preventing
> notification of CXL RAS errors.[2]
> 
> Enable CXL PCIe port RAS notification by unmasking the ports' AER CIE
> and UIE errors.
> 
> [1] CXL3.1 - 12.2.2 CXL Root Ports, Downstream Switch Ports, and Upstream
>              Switch Ports
> [2] PCI6.0 - 7.8.4.3 Uncorrectable Error Mask Register (Offset 08h),
>              7.8.4.6 Correctable Error Mask Register (Offset 14h)
> 
> Signed-off-by: Terry Bowman <terry.bowman@....com>

I'm not sure doing this from a driver other than the one handling the
errors makes sense.  It is doing a couple of RMW without any locking
or guarantees that the driver bound to the PCI port might care about
this changing.

I'd like more info on why we don't just turn this on in general
and hence avoid the need to control it from the 'wrong' place.

Jonathan



> ---
>  drivers/cxl/core/pci.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index e630eccb733d..73637d39df0a 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -861,6 +861,12 @@ void cxl_setup_parent_uport(struct device *host, struct cxl_port *port)
>  	struct device *uport_dev = port->uport_dev;
>  
>  	cxl_port_map_regs(uport_dev, map, regs);
> +
> +	if (dev_is_pci(uport_dev)) {
> +		struct pci_dev *pdev = to_pci_dev(uport_dev);
> +
> +		pci_aer_unmask_internal_errors(pdev);

I'd skip the local variable for conciseness.

> +	}
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_setup_parent_uport, CXL);
>  
> @@ -878,6 +884,12 @@ void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport)
>  
>  	if (dport->rch)
>  		cxl_disable_rch_root_ints(dport);
> +
> +	if (dev_is_pci(dport_dev)) {
> +		struct pci_dev *pdev = to_pci_dev(dport_dev);
> +
> +		pci_aer_unmask_internal_errors(pdev);

likewise.

> +	}
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_setup_parent_dport, CXL);
>  


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ