netdev - Re: [Intel-wired-lan] [PATCH] igc: Mask replay rollover/timeout errors in I225

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <249b621a-a527-6ee9-7802-acb34e91e433@intel.com>
Date:   Sun, 1 Jan 2023 09:27:24 +0200
From:   "Neftin, Sasha" <sasha.neftin@...el.com>
To:     Rajat Khandelwal <rajat.khandelwal@...ux.intel.com>,
        <jesse.brandeburg@...el.com>, <anthony.l.nguyen@...el.com>,
        <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
        <pabeni@...hat.com>, "Ruinskiy, Dima" <dima.ruinskiy@...el.com>,
        "Lifshits, Vitaly" <vitaly.lifshits@...el.com>,
        "Avivi, Amir" <amir.avivi@...el.com>
CC:     <netdev@...r.kernel.org>, <intel-wired-lan@...ts.osuosl.org>,
        <linux-kernel@...r.kernel.org>, <rajat.khandelwal@...el.com>,
        "Neftin, Sasha" <sasha.neftin@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH] igc: Mask replay rollover/timeout
 errors in I225_LMVP

On 12/29/2022 14:26, Rajat Khandelwal wrote:
> The CPU logs get flooded with replay rollover/timeout AER errors in
> the system with i225_lmvp connected, usually inside thunderbolt devices.
> 
> One of the prominent TBT4 docks we use is HP G4 Hook2, which incorporates
> an Intel Foxville chipset, which uses the igc driver.
> On connecting ethernet, CPU logs get inundated with these errors. The point
> is we shouldn't be spamming the logs with such correctible errors as it
> confuses other kernel developers less familiar with PCI errors, support
> staff, and users who happen to look at the logs.
> 
> Signed-off-by: Rajat Khandelwal <rajat.khandelwal@...ux.intel.com>
> ---
>   drivers/net/ethernet/intel/igc/igc_main.c | 28 +++++++++++++++++++++--
>   1 file changed, 26 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index ebff0e04045d..a3a6e8086c8d 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -6201,6 +6201,26 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg)
>   	return value;
>   }
>   
> +#ifdef CONFIG_PCIEAER
> +static void igc_mask_aer_replay_correctible(struct igc_adapter *adapter)
> +{
> +	struct pci_dev *pdev = adapter->pdev;
> +	u32 aer_pos, corr_mask;
> +
> +	if (pdev->device != IGC_DEV_ID_I225_LMVP)
> +		return;
> +
> +	aer_pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
> +	if (!aer_pos)
> +		return;
> +
> +	pci_read_config_dword(pdev, aer_pos + PCI_ERR_COR_MASK, &corr_mask);
> +
> +	corr_mask |= PCI_ERR_COR_REP_ROLL | PCI_ERR_COR_REP_TIMER;
> +	pci_write_config_dword(pdev, aer_pos + PCI_ERR_COR_MASK, corr_mask);
> +}
> +#endif
> +
Hello Rajat,
May we use the privilege flag approach, give user control: and mask some 
advanced errors?
Although...  Why did it happen? Didn't you prefer not to investigate it 
or else mask it? (I have concerns about the PCIe link over the 
thunderbolt tunnel)
>   /**
>    * igc_probe - Device Initialization Routine
>    * @pdev: PCI device information struct
> @@ -6236,8 +6256,6 @@ static int igc_probe(struct pci_dev *pdev,
>   	if (err)
>   		goto err_pci_reg;
>   
> -	pci_enable_pcie_error_reporting(pdev);
> -
>   	err = pci_enable_ptm(pdev, NULL);
>   	if (err < 0)
>   		dev_info(&pdev->dev, "PCIe PTM not supported by PCIe bus/controller\n");
> @@ -6272,6 +6290,12 @@ static int igc_probe(struct pci_dev *pdev,
>   	if (!adapter->io_addr)
>   		goto err_ioremap;
>   
> +#ifdef CONFIG_PCIEAER
> +	igc_mask_aer_replay_correctible(adapter);
> +#endif
> +
> +	pci_enable_pcie_error_reporting(pdev);
> +
>   	/* hw->hw_addr can be zeroed, so use adapter->io_addr for unmap */
>   	hw->hw_addr = adapter->io_addr;
>