Date: Mon, 4 Mar 2024 12:10:24 -0800
From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
To: Ethan Zhao <haifeng.zhao@...ux.intel.com>, bhelgaas@...gle.com,
 lukas@...ner.de
Cc: Smita.KoralahalliChannabasappa@....com, ilpo.jarvinen@...ux.intel.com,
 linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org, kbusch@...nel.org
Subject: Re: [PATCH pci-next] pci/edr: Ignore Surprise Down error on hot
 removal


On 3/4/24 1:08 AM, Ethan Zhao wrote:
> Per PCI Firmware spec r3.3 sec 4.6.12, in the firmware-first DPC
> handling path, FW should clear the UC errors logged by the port and bring
> the link out of DPC. Because the wording in the spec is ambiguous, some
> BIOSes don't clear the Surprise Down error and the error bits in PCI status, but

As Lukas mentioned, please include the hardware and BIOS version
where you see this issue.

> still notify the OS to handle it. Thus the following trick is needed in EDR
> when double reporting (hot removal interrupt and DPC notification) is hit.

EDR notification is generally used when firmware wants the OS to invalidate
or recover the error state of child devices while handling a containment event.
Since this DPC event is a side effect of async removal, there is no recovery
involved. So there is no value in firmware notifying the OS via an ACPI
notification only for the OS to ignore it.

If you check the PCI Firmware spec, sec 4.6.12, IMPLEMENTATION NOTE, it
recommends that firmware ignore a DPC event caused by hotplug surprise.
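
For reference, the condition under discussion can be modeled in a few lines.
This is only a userspace sketch, assuming the check amounts to "hotplug-capable
port with Surprise Down logged in the AER Uncorrectable Error Status register".
struct model_port and the model_* names are hypothetical stand-ins, not kernel
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Surprise Down bit (bit 5) of the AER Uncorrectable Error Status register. */
#define MODEL_ERR_UNC_SURPDN 0x00000020u

/* Hypothetical stand-in for the few port properties that matter here. */
struct model_port {
	bool is_hotplug_bridge;	/* port has a hotplug-capable slot */
	uint32_t uncor_status;	/* snapshot of AER Uncorrectable Error Status */
};

/* True when a DPC trigger looks like a side effect of async hot removal. */
static bool model_is_surprise_removal(const struct model_port *port)
{
	if (!port->is_hotplug_bridge)
		return false;

	return (port->uncor_status & MODEL_ERR_UNC_SURPDN) != 0;
}

/* Small usage example: only a hotplug port with Surprise Down set qualifies. */
static void model_example(void)
{
	struct model_port hp = { true, MODEL_ERR_UNC_SURPDN };
	struct model_port fixed = { false, MODEL_ERR_UNC_SURPDN };

	assert(model_is_surprise_removal(&hp));
	assert(!model_is_surprise_removal(&fixed));
}
```

In this model, a firmware following the implementation note would make the
same decision before ever sending the notification.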

>
> https://patchwork.kernel.org/project/linux-pci/patch/20240207181854.121335-1-Smita.KoralahalliChannabasappa@....com/
>
> Signed-off-by: Ethan Zhao <haifeng.zhao@...ux.intel.com>
> ---
>  drivers/pci/pci.h      | 1 +
>  drivers/pci/pcie/dpc.c | 9 +++++----
>  drivers/pci/pcie/edr.c | 3 +++
>  3 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 50134b5e3235..3787bb32e724 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -443,6 +443,7 @@ void pci_save_dpc_state(struct pci_dev *dev);
>  void pci_restore_dpc_state(struct pci_dev *dev);
>  void pci_dpc_init(struct pci_dev *pdev);
>  void dpc_process_error(struct pci_dev *pdev);
> +bool dpc_handle_surprise_removal(struct pci_dev *pdev);
>  pci_ers_result_t dpc_reset_link(struct pci_dev *pdev);
>  bool pci_dpc_recovered(struct pci_dev *pdev);
>  #else
> diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
> index 98b42e425bb9..be79f205e04c 100644
> --- a/drivers/pci/pcie/dpc.c
> +++ b/drivers/pci/pcie/dpc.c
> @@ -319,8 +319,10 @@ static void pci_clear_surpdn_errors(struct pci_dev *pdev)
>  	pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_FED);
>  }
>  
> -static void dpc_handle_surprise_removal(struct pci_dev *pdev)
> +bool dpc_handle_surprise_removal(struct pci_dev *pdev)
>  {
> +	if (!dpc_is_surprise_removal(pdev))
> +		return false;
>  	if (!pcie_wait_for_link(pdev, false)) {
>  		pci_info(pdev, "Data Link Layer Link Active not cleared in 1000 msec\n");
>  		goto out;
> @@ -338,6 +340,7 @@ static void dpc_handle_surprise_removal(struct pci_dev *pdev)
>  out:
>  	clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags);
>  	wake_up_all(&dpc_completed_waitqueue);
> +	return true;
>  }
>  
>  static bool dpc_is_surprise_removal(struct pci_dev *pdev)
> @@ -362,10 +365,8 @@ static irqreturn_t dpc_handler(int irq, void *context)
>  	 * According to PCIe r6.0 sec 6.7.6, errors are an expected side effect
>  	 * of async removal and should be ignored by software.
>  	 */
> -	if (dpc_is_surprise_removal(pdev)) {
> -		dpc_handle_surprise_removal(pdev);
> +	if (dpc_handle_surprise_removal(pdev))
>  		return IRQ_HANDLED;
> -	}
>  
>  	dpc_process_error(pdev);
>  
> diff --git a/drivers/pci/pcie/edr.c b/drivers/pci/pcie/edr.c
> index 5f4914d313a1..556edfb2696a 100644
> --- a/drivers/pci/pcie/edr.c
> +++ b/drivers/pci/pcie/edr.c
> @@ -184,6 +184,9 @@ static void edr_handle_event(acpi_handle handle, u32 event, void *data)
>  		goto send_ost;
>  	}
>  
> +	if (dpc_handle_surprise_removal(edev))
> +		goto send_ost;
> +
>  	dpc_process_error(edev);
>  	pci_aer_raw_clear_status(edev);
>  
>
> base-commit: a66f2b4a4d365dc4bac35576f3a9d4f5982f1d63
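
To make the control flow of the quoted change easier to follow, here is a
userspace sketch of it, not kernel code. It assumes the helper only needs to
report whether it handled the event. struct flow_port, its counters, and the
flow_* names are hypothetical, and the real cleanup (waiting for link down,
clearing status bits) is reduced to a counter. The forward declaration is
there because, in this sketch as in the quoted hunk, the helper calls the
check before the check is defined.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev. */
struct flow_port {
	bool surprise_removal;	/* what the surprise-removal check would report */
	int cleanups;		/* times the surprise-removal cleanup ran */
	int recoveries;		/* times full error processing ran */
};

/* Forward declaration: the check is defined after its first caller. */
static bool flow_is_surprise_removal(const struct flow_port *port);

/* Check, clean up, and report whether the event was handled here. */
static bool flow_handle_surprise_removal(struct flow_port *port)
{
	if (!flow_is_surprise_removal(port))
		return false;

	port->cleanups++;	/* stands in for the status-clearing steps */
	return true;
}

static bool flow_is_surprise_removal(const struct flow_port *port)
{
	return port->surprise_removal;
}

/* Stands in for dpc_process_error() and the recovery that follows it. */
static void flow_process_error(struct flow_port *port)
{
	port->recoveries++;
}

/* Native path, modeled on dpc_handler(). */
static void flow_dpc_handler(struct flow_port *port)
{
	if (flow_handle_surprise_removal(port))
		return;

	flow_process_error(port);
}

/*
 * Firmware-first path, modeled on edr_handle_event(). Returns true when
 * error processing was skipped (the "goto send_ost" shortcut in the patch).
 */
static bool flow_edr_event(struct flow_port *port)
{
	if (flow_handle_surprise_removal(port))
		return true;

	flow_process_error(port);
	return false;
}

/* Usage example: a hot removal is cleaned up once and never recovered. */
static void flow_example(void)
{
	struct flow_port removed = { true, 0, 0 };

	assert(flow_edr_event(&removed));
	assert(removed.cleanups == 1 && removed.recoveries == 0);
}
```

Both entry points share one early exit, which is the point of moving the
check inside the helper.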

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

