lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 24 Jun 2024 11:46:01 -0500
From: Terry Bowman <Terry.Bowman@....com>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc: dan.j.williams@...el.com, ira.weiny@...el.com, dave@...olabs.net,
 dave.jiang@...el.com, alison.schofield@...el.com, ming4.li@...el.com,
 vishal.l.verma@...el.com, jim.harris@...sung.com,
 ilpo.jarvinen@...ux.intel.com, ardb@...nel.org,
 sathyanarayanan.kuppuswamy@...ux.intel.com, linux-cxl@...r.kernel.org,
 linux-kernel@...r.kernel.org, Yazen.Ghannam@....com, Robert.Richter@....com
Subject: Re: [RFC PATCH 9/9] cxl/pci: Enable interrupts for CXL PCIe ports'
 AER internal errors

Hi Jonathan,

I added responses inline below.

On 6/20/24 08:15, Jonathan Cameron wrote:
> On Mon, 17 Jun 2024 15:04:11 -0500
> Terry Bowman <terry.bowman@....com> wrote:
> 
>> CXL RAS errors are reported through AER interrupts using the AER status:
>> correctbale internal errors (CIE) and AER uncorrectable internal errors
> 
> correctable
> 

Thanks.

>> (UIE).[1] But, the AER CIE/UIE are disabled by default preventing
>> notification of CXL RAS errors.[2]
>>
>> Enable CXL PCIe port RAS notification by unmasking the ports' AER CIE
>> and UIE errors.
>>
>> [1] CXL3.1 - 12.2.2 CXL Root Ports, Downstream Switch Ports, and Upstream
>>              Switch Ports
>> [2] PCI6.0 - 7.8.4.3 Uncorrectable Error Mask Register (Offset 08h),
>>              7.8.4.6 Correctable Error Mask Register (Offset 14h)
>>
>> Signed-off-by: Terry Bowman <terry.bowman@....com>
> 
> I'm not sure doing this from a driver other than the one handling the
> errors makes sense.  It is doing a couple of RMW without any locking
> or guarantees that the driver bound to the PCI port might care about
> this changing.
> 

I think this could fit into the helper function mentioned in our earlier 
discussion. When the portdrv's notifier enabler is called it could also
enable the UIE/CIE.

> I'd like more info on why we don't just turn this on in general
> and hence avoid the need to control it from the 'wrong' place.
> 
> Jonathan
> 

I was trying to enable only where needed given the one case is not a 
pattern, yet. At this point it is only for CXL RCH downstream port 
and CXL VH ports (portdrv).

Would you like for the UIE/CIE unmask added to the AER driver init ?

> 
> 
>> ---
>>  drivers/cxl/core/pci.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index e630eccb733d..73637d39df0a 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -861,6 +861,12 @@ void cxl_setup_parent_uport(struct device *host, struct cxl_port *port)
>>  	struct device *uport_dev = port->uport_dev;
>>  
>>  	cxl_port_map_regs(uport_dev, map, regs);
>> +
>> +	if (dev_is_pci(uport_dev)) {
>> +		struct pci_dev *pdev = to_pci_dev(uport_dev);
>> +
>> +		pci_aer_unmask_internal_errors(pdev);
> 
> I'd skip the local variable for conciseness.
> 
>> +	}
>>  }
>>  EXPORT_SYMBOL_NS_GPL(cxl_setup_parent_uport, CXL);
>>  
>> @@ -878,6 +884,12 @@ void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport)
>>  
>>  	if (dport->rch)
>>  		cxl_disable_rch_root_ints(dport);
>> +
>> +	if (dev_is_pci(dport_dev)) {
>> +		struct pci_dev *pdev = to_pci_dev(dport_dev);
>> +
>> +		pci_aer_unmask_internal_errors(pdev);
> 
> likewise.
> 

Got it.

Regards,
Terry

>> +	}
>>  }
>>  EXPORT_SYMBOL_NS_GPL(cxl_setup_parent_dport, CXL);
>>  
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ