lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230622090403.GA21721@wunner.de>
Date:   Thu, 22 Jun 2023 11:04:03 +0200
From:   Lukas Wunner <lukas@...ner.de>
To:     Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>
Cc:     linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        Bjorn Helgaas <bhelgaas@...gle.com>, oohall@...il.com,
        Mahesh J Salgaonkar <mahesh@...ux.ibm.com>,
        Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        Yazen Ghannam <yazen.ghannam@....com>,
        Fontenot Nathan <Nathan.Fontenot@....com>
Subject: Re: [PATCH v3 1/2] PCI: pciehp: Add support for async hotplug with
 native AER and DPC/EDR

On Wed, Jun 21, 2023 at 06:51:51PM +0000, Smita Koralahalli wrote:
> --- a/drivers/pci/pcie/dpc.c
> +++ b/drivers/pci/pcie/dpc.c
> @@ -292,10 +292,68 @@ void dpc_process_error(struct pci_dev *pdev)
>  	}
>  }
>  
> +static void pci_clear_surpdn_errors(struct pci_dev *pdev)
> +{
> +	u16 reg16;
> +	u32 reg32;
> +
> +	pci_read_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, &reg32);
> +	pci_write_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, reg32);

Make this read+write conditional on "if (pdev->dpc_rp_extensions)"
as the register otherwise doesn't exist.

Wrap to 80 chars per line.


> +	pci_read_config_word(pdev, PCI_STATUS, &reg16);
> +	pci_write_config_word(pdev, PCI_STATUS, reg16);
> +
> +	pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_FED);
> +}

A code comment might be useful here saying that in practice,
Surprise Down errors have been observed to also set error bits
in the Status Register as well as the Fatal Error Detected bit
in the Device Status Register.


> +static void dpc_handle_surprise_removal(struct pci_dev *pdev)
> +{

I'm wondering if we also need

	if (!pcie_wait_for_link(pdev, false)) {
		pci_info(pdev, "Data Link Layer Link Active not cleared in 1000 msec\n");
		goto out;
	}

here, similar to dpc_reset_link() and in accordance with PCIe r6.1
sec 6.2.11:

	"To ensure that the LTSSM has time to reach the Disabled state
	or at least to bring the Link down under a variety of error
	conditions, software must leave the Downstream Port in DPC
	until the Data Link Layer Link Active bit in the Link Status
	Register reads 0b; otherwise, the result is undefined."


> +	if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev)) {
> +		pci_err(pdev, "failed to retrieve DPC root port on async remove\n");
> +		goto out;
> +	}

I don't think pci_err() is needed here as dpc_wait_rp_inactive()
already emits a message.  (I think I mistakenly gave the advice
to emit an error here in an earlier review comment -- sorry!)


> +
> +	pci_aer_raw_clear_status(pdev);
> +	pci_clear_surpdn_errors(pdev);
> +
> +	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
> +			      PCI_EXP_DPC_STATUS_TRIGGER);
> +
> +out:
> +	clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags);
> +	wake_up_all(&dpc_completed_waitqueue);
> +}
> +
> +static bool dpc_is_surprise_removal(struct pci_dev *pdev)
> +{
> +	u16 status;
> +
> +	pci_read_config_word(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS, &status);

Wrap to 80 chars.


> +
> +	if (!pdev->is_hotplug_bridge)
> +		return false;

Move this if-clause to the beginning if the function so that
you omit the unnecessary register read if it's not a hotplug
bridge.


> +
> +	if (!(status & PCI_ERR_UNC_SURPDN))
> +		return false;
> +
> +	return true;
> +}
> +
>  static irqreturn_t dpc_handler(int irq, void *context)
>  {
>  	struct pci_dev *pdev = context;
>  
> +	/*
> +	 * According to Section 6.7.6 of the PCIe Base Spec 6.0, since async
> +	 * removal might be unexpected, errors might be reported as a side
> +	 * effect of the event and software should handle them as an expected
> +	 * part of this event.
> +	 */

I think the usual way to reference the spec is "PCIe r6.0 sec 6.7.6".

Maybe that's just me but I find the code comment a little difficult
to parse.  Maybe something like the following?

	/*
	 * According to PCIe r6.0 sec 6.7.6, errors are an expected side effect
	 * of async removal and should be ignored by software.
	 */

Thanks,

Lukas

> +	if (dpc_is_surprise_removal(pdev)) {
> +		dpc_handle_surprise_removal(pdev);
> +		return IRQ_HANDLED;
> +	}
> +
>  	dpc_process_error(pdev);
>  
>  	/* We configure DPC so it only triggers on ERR_FATAL */
> -- 
> 2.17.1
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ