Message-ID: <3fe7b527-5030-c916-79fe-241bf37e4bab@linux.intel.com>
Date: Mon, 13 Jan 2025 17:08:24 +0200 (EET)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Jiwei Sun <jiwei.sun.bj@...com>
cc: macro@...am.me.uk, bhelgaas@...gle.com, linux-pci@...r.kernel.org, 
    LKML <linux-kernel@...r.kernel.org>, guojinhui.liam@...edance.com, 
    helgaas@...nel.org, Lukas Wunner <lukas@...ner.de>, ahuang12@...ovo.com, 
    sunjw10@...ovo.com
Subject: Re: [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during
 hotplug testing

On Fri, 10 Jan 2025, Jiwei Sun wrote:

> From: Jiwei Sun <sunjw10@...ovo.com>
> 
> When we do a quick hot-add/hot-remove test (within 1 second) with a PCIe
> Gen 5 NVMe disk, there is a possibility that the PCIe bridge's link speed
> will drop from 32GT/s to 2.5GT/s:
> 
> pcieport 10002:00:04.0: pciehp: Slot(75): Link Down
> pcieport 10002:00:04.0: pciehp: Slot(75): Card present
> pcieport 10002:00:04.0: pciehp: Slot(75): No device found
> ...
> pcieport 10002:00:04.0: pciehp: Slot(75): Card present
> pcieport 10002:00:04.0: pciehp: Slot(75): No device found
> pcieport 10002:00:04.0: pciehp: Slot(75): Card present
> pcieport 10002:00:04.0: pciehp: Slot(75): No device found
> pcieport 10002:00:04.0: pciehp: Slot(75): Card present
> pcieport 10002:00:04.0: pciehp: Slot(75): No device found
> pcieport 10002:00:04.0: pciehp: Slot(75): Card present
> pcieport 10002:00:04.0: pciehp: Slot(75): No device found
> pcieport 10002:00:04.0: pciehp: Slot(75): Card present
> pcieport 10002:00:04.0: pciehp: Slot(75): No device found
> pcieport 10002:00:04.0: pciehp: Slot(75): Card present
> pcieport 10002:00:04.0: broken device, retraining non-functional downstream link at 2.5GT/s
> pcieport 10002:00:04.0: pciehp: Slot(75): No link
> pcieport 10002:00:04.0: pciehp: Slot(75): Card present
> pcieport 10002:00:04.0: pciehp: Slot(75): Link Up
> pcieport 10002:00:04.0: pciehp: Slot(75): No device found
> pcieport 10002:00:04.0: pciehp: Slot(75): Card present
> pcieport 10002:00:04.0: pciehp: Slot(75): No device found
> pcieport 10002:00:04.0: pciehp: Slot(75): Card present
> pci 10002:02:00.0: [144d:a826] type 00 class 0x010802 PCIe Endpoint
> pci 10002:02:00.0: BAR 0 [mem 0x00000000-0x00007fff 64bit]
> pci 10002:02:00.0: VF BAR 0 [mem 0x00000000-0x00007fff 64bit]
> pci 10002:02:00.0: VF BAR 0 [mem 0x00000000-0x001fffff 64bit]: contains BAR 0 for 64 VFs
> pci 10002:02:00.0: 8.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x4 link at 10002:00:04.0 (capable of 126.028 Gb/s with 32.0 GT/s PCIe x4 link)
> 
> If an NVMe disk is hot-removed, the pciehp interrupt is triggered and the
> pciehp_ist kernel thread is woken up; pcie_failed_link_retrain() is then
> called, as shown in the following call trace.
> 
>    irq/87-pciehp-2524    [121] ..... 152046.006765: pcie_failed_link_retrain <-pcie_wait_for_link
>    irq/87-pciehp-2524    [121] ..... 152046.006782: <stack trace>
>  => [FTRACE TRAMPOLINE]
>  => pcie_failed_link_retrain
>  => pcie_wait_for_link
>  => pciehp_check_link_status
>  => pciehp_enable_slot
>  => pciehp_handle_presence_or_link_change
>  => pciehp_ist
>  => irq_thread_fn
>  => irq_thread
>  => kthread
>  => ret_from_fork
>  => ret_from_fork_asm
> 
> According to our investigation, the issue is caused by the following scenario:
> 
> NVMe disk	pciehp hardirq
> hot-remove 	top-half		pciehp irq kernel thread
> ======================================================================
> pciehp hardirq
> is triggered
> 	    	CPU handles pciehp
> 		hardirq
> 		pciehp irq kthread is
> 		woken up
> 					pciehp_ist
> 					...
> 					  pcie_failed_link_retrain
> 					    read PCI_EXP_LNKCTL2 register
> 					    read PCI_EXP_LNKSTA register
> If the NVMe disk is
> hot-added before
> pcie_retrain_link() is called
> 					    set target speed to 2_5GT

This assumes LBMS has been seen but DLLLA is not set? Why is that?

> 					      pcie_bwctrl_change_speed
> 	  				        pcie_retrain_link

> 						: the retrain work will be
> 						  successful, because
> 						  pci_match_id() will be
> 						  0 in
> 						  pcie_failed_link_retrain()

There's no pci_match_id() in pcie_retrain_link()?? What does that ':' 
mean? I think the nesting level is wrong in your flow description.

I don't understand how retrain success relates to pci_match_id(), as 
there are two different steps in pcie_failed_link_retrain().

In step 1, pcie_failed_link_retrain() sets the speed to 2.5GT/s if DLLLA=0 
and LBMS has been seen. Why does that condition occur in your case? You 
didn't explain LBMS (or DLLLA) in the sequence above, so it's hard to 
follow what is going on. LBMS is of particular interest here because I'm 
trying to understand whether something should clear it on the hotplug 
side (there's already one call to clear it in remove_board()).
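
For reference, step 1 is gated roughly like this (a simplified sketch of 
pcie_failed_link_retrain(), not the exact code; "lbms_seen" here is just 
a stand-in for however the LBMS assertion is actually tracked):

	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);

	/* Step 1: link down (DLLLA clear) but LBMS seen -> clamp to 2.5GT/s */
	if (!(lnksta & PCI_EXP_LNKSTA_DLLLA) && lbms_seen) {
		pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");
		ret = pcie_set_target_speed(dev, PCIE_SPEED_2_5GT, false);
	}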

In step 2 (reached when pcie_set_target_speed() in step 1 succeeded), 
pcie_failed_link_retrain() attempts to restore a speed above 2.5GT/s; 
this only occurs when pci_match_id() matches. I guess you're trying to 
say that step 2 is not taken because pci_match_id() does not match, but 
the wording above is very confusing.
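
The step 2 gate is visible in your diff below; slightly abbreviated, it 
reads:

	/*
	 * Step 2: the link came up at 2.5GT/s and this is a known broken
	 * device (currently only the ASMedia ASM2824), so lift the clamp.
	 */
	if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
	    pci_match_id(ids, dev)) {
		pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
		...
	}

If pci_match_id() does not match there, the 2.5GT/s restriction set in 
step 1 simply stays in place.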

Overall, I have failed to fully understand the scenario despite trying 
to think it through over the past few days.

> 						  the Target Link Speed
> 						  field of the Link Control
> 						  2 register will remain 0x1.
> 
> In order to fix the issue, skip the link retraining workaround for all
> devices except the ASMedia ASM2824.
> 
> Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> Reported-by: Adrian Huang <ahuang12@...ovo.com>
> Signed-off-by: Jiwei Sun <sunjw10@...ovo.com>
> ---
>  drivers/pci/quirks.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 605628c810a5..ff04ebd9ae16 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -104,6 +104,9 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
>  	u16 lnksta, lnkctl2;
>  	int ret = -ENOTTY;
>  
> +	if (!pci_match_id(ids, dev))
> +		return 0;
> +
>  	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
>  	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
>  		return ret;
> @@ -129,8 +132,7 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
>  	}
>  
>  	if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
> -	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> -	    pci_match_id(ids, dev)) {
> +	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT) {
>  		u32 lnkcap;
>  
>  		pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
> 
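
(To illustrate the remove_board() remark above: clearing a stale LBMS 
indication on the hotplug side would conceptually be a write-1-to-clear 
of the status bit, something along these lines; a sketch, not the actual 
pciehp code:

	/* LBMS is RW1C: writing 1 clears the stale indication. */
	pcie_capability_write_word(dev, PCI_EXP_LNKSTA, PCI_EXP_LNKSTA_LBMS);

whether that is the right place to do it is exactly what I'm trying to 
work out.)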

-- 
 i.


