Message-ID: <YxWlc1n4HRxawa/K@kroah.com>
Date: Mon, 5 Sep 2022 09:29:55 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Kai-Heng Feng <kai.heng.feng@...onical.com>
Cc: mika.westerberg@...ux.intel.com, andreas.noever@...il.com,
michael.jamet@...el.com, YehezkelShB@...il.com,
sanju.mehta@....com, mario.limonciello@....com,
linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] thunderbolt: Resume PCIe bridges after switch is found
on AMD USB4 controller
On Mon, Sep 05, 2022 at 02:56:22PM +0800, Kai-Heng Feng wrote:
> The AMD USB4 controller cannot detect hotplugged external PCIe devices,
> such as an external NVMe drive, because the card/link are not up:
>
> pcieport 0000:00:04.1: pciehp: pciehp_check_link_active: lnk_status = 1101
That sounds like a hardware bug; how does this work in other operating
systems for this hardware?
> Running `lspci`, which resumes the pciehp bridges, makes the external
> devices detectable.
That's not good :(
> A long delay before checking card/link presence doesn't help, either.
> The only way to make the hotplug work is to enable pciehp interrupt and
> check card presence after the TB switch is added.
>
> Since the USB4 controller and its PCIe bridges are siblings in the PCI
> topology, hardcode the bridge ID so the TBT driver can wake them up and
> check presence.
As I mention below, this is not an acceptable solution.
AMD developers, any ideas on how to get this fixed in the TB controller
firmware instead?
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216448
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@...onical.com>
> ---
> drivers/thunderbolt/nhi.c | 29 +++++++++++++++++++++++++++++
> drivers/thunderbolt/switch.c | 6 ++++++
> drivers/thunderbolt/tb.c | 1 +
> drivers/thunderbolt/tb.h | 5 +++++
> include/linux/thunderbolt.h | 1 +
> 5 files changed, 42 insertions(+)
>
> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
> index cb8c9c4ae93a2..75f5ce5e22978 100644
> --- a/drivers/thunderbolt/nhi.c
> +++ b/drivers/thunderbolt/nhi.c
> @@ -1225,6 +1225,8 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> {
> struct tb_nhi *nhi;
> struct tb *tb;
> + struct pci_dev *p = NULL;
> + struct tb_pci_bridge *pci_bridge, *n;
> int res;
>
> if (!nhi_imr_valid(pdev)) {
> @@ -1306,6 +1308,19 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> nhi_shutdown(nhi);
> return res;
> }
> +
> + if (pdev->vendor == PCI_VENDOR_ID_AMD) {
> + while ((p = pci_get_device(PCI_VENDOR_ID_AMD, 0x14cd, p))) {
> + pci_bridge = kmalloc(sizeof(struct tb_pci_bridge), GFP_KERNEL);
> + if (!pci_bridge)
> + goto cleanup;
> +
> + pci_bridge->bridge = p;
> + INIT_LIST_HEAD(&pci_bridge->list);
> + list_add(&pci_bridge->list, &tb->bridge_list);
> + }
> + }
You can't walk the device tree and create a "shadow" list of devices
like this and expect any lifetime rules to work properly with them at
all.
Please do not do this.
greg k-h