Message-ID: <CAAd53p5cz0VWUH9Rdvk70pcpY-PLc9SV8UCvMEc0+TBGES5W5w@mail.gmail.com>
Date:   Thu, 8 Sep 2022 22:02:34 +0800
From:   Kai-Heng Feng <kai.heng.feng@...onical.com>
To:     "Limonciello, Mario" <Mario.Limonciello@....com>
Cc:     "mika.westerberg@...ux.intel.com" <mika.westerberg@...ux.intel.com>,
        "andreas.noever@...il.com" <andreas.noever@...il.com>,
        "michael.jamet@...el.com" <michael.jamet@...el.com>,
        "YehezkelShB@...il.com" <YehezkelShB@...il.com>,
        "Mehta, Sanju" <Sanju.Mehta@....com>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Tsao, Anson" <anson.tsao@....com>,
        Greg KH <gregkh@...uxfoundation.org>
Subject: Re: [PATCH] thunderbolt: Resume PCIe bridges after switch is found on
 AMD USB4 controller

On Thu, Sep 8, 2022 at 12:30 AM Limonciello, Mario
<Mario.Limonciello@....com> wrote:
>
> [Public]
>
> Hi,
>
> > -----Original Message-----
> > From: Greg KH <gregkh@...uxfoundation.org>
> > Sent: Monday, September 5, 2022 02:30
> > To: Kai-Heng Feng <kai.heng.feng@...onical.com>
> > Cc: mika.westerberg@...ux.intel.com; andreas.noever@...il.com;
> > michael.jamet@...el.com; YehezkelShB@...il.com; Mehta, Sanju
> > <Sanju.Mehta@....com>; Limonciello, Mario
> > <Mario.Limonciello@....com>; linux-usb@...r.kernel.org;
> > linux-kernel@...r.kernel.org
> > Subject: Re: [PATCH] thunderbolt: Resume PCIe bridges after switch is found
> > on AMD USB4 controller
> >
> > On Mon, Sep 05, 2022 at 02:56:22PM +0800, Kai-Heng Feng wrote:
> > > AMD USB4 cannot detect external PCIe devices like external NVMe when
> > > it's hotplugged, because card/link are not up:
> > >
> > > pcieport 0000:00:04.1: pciehp: pciehp_check_link_active: lnk_status = 1101
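The lnk_status value in the log line above can be decoded by hand. The following is a minimal userspace sketch, assuming the value is printed in hexadecimal and is the PCIe Link Status register. The mask values are the standard Link Status fields (the same values as the PCI_EXP_LNKSTA_* macros in the kernel's pci_regs.h); the struct and function names are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Link Status field masks; values match the PCI_EXP_LNKSTA_* macros in
 * include/uapi/linux/pci_regs.h. */
#define LNKSTA_CLS   0x000f /* Current Link Speed */
#define LNKSTA_NLW   0x03f0 /* Negotiated Link Width */
#define LNKSTA_SLC   0x1000 /* Slot Clock Configuration */
#define LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */

/* Hypothetical helper type for this sketch, not a kernel structure. */
struct lnksta_fields {
	unsigned int speed_code;   /* raw Current Link Speed encoding */
	unsigned int width;        /* raw Negotiated Link Width value */
	bool slot_clock;
	bool dll_link_active;
};

/* Split a raw 16-bit Link Status value into its fields. */
struct lnksta_fields decode_lnksta(uint16_t lnksta)
{
	struct lnksta_fields f;

	f.speed_code = lnksta & LNKSTA_CLS;
	f.width = (lnksta & LNKSTA_NLW) >> 4;
	f.slot_clock = (lnksta & LNKSTA_SLC) != 0;
	f.dll_link_active = (lnksta & LNKSTA_DLLLA) != 0;
	return f;
}

/* Usage sketch with the value from the log line (assumed to be hex). */
void lnksta_example(void)
{
	struct lnksta_fields f = decode_lnksta(0x1101);

	/* Bit 13 is clear, so the data link layer does not report an active link. */
	assert(!f.dll_link_active);
	assert(f.slot_clock);
}
```

With 0x1101 the Data Link Layer Link Active bit is clear, which matches the "card/link are not up" description. The speed and width fields are only meaningful once the link is up, so they should not be read as the real link parameters here.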
> >
> > That sounds like a hardware bug, how does this work in other operating
> > systems for this hardware?
>
> We happen to have this HP system in our lab.  My colleague Anson (now on CC) flashed
> the same BIOS to it (01.02.01) using dediprog and loaded a 6.0-rc3 mainline kernel built
> from the Canonical mainline kernel PPA.
>
> He then tried to hotplug a TBT3 SSD a number of times but couldn't hit this issue.
> I attached his log to the kernel Bugzilla.

Nice to hear. Hopefully this can be fixed on the firmware/hardware side.

>
> >
> > > Running `lspci` to resume the pciehp bridges makes it possible to find the external devices.
> >
> > That's not good :(
> >
> > > A long delay before checking card/link presence doesn't help, either.
> > > The only way to make the hotplug work is to enable pciehp interrupt and
> > > check card presence after the TB switch is added.
> > >
> > > Since USB4 and its PCIe bridges are siblings in the topology, hardcode
> > > the bridge ID so the TBT driver can wake them up to check presence.
> >
> > As I mention below, this is not an acceptable solution.
> >
> > AMD developers, any ideas on how to get this fixed in the TB controller
> > firmware instead?
>
> Anson also double checked on the AMD reference hardware that the HP system is built
> against and couldn't reproduce it there either.
>
> KH, I've got a few questions/comments to try to better explain why we're here.
>
> 1) How did you flash the 01.02.01 firmware?  In Anson's check, he used dediprog.
> Is it possible there was some stateful stuff used by HP's BIOS still on the SPI from the
> upgrade that didn't get set/cleared properly from an earlier pre-release BIOS?

We used UEFI capsule to update the firmware, via fwupd.

>
> 2) Did you change any BIOS settings?  Particularly anything to do with Pre-OS CM?

No, nothing in BIOS was changed.

>
> 3) If you explicitly reset to HP's "default BIOS settings" does it resolve?

That doesn't help. I put the device into ACPI G3 and that doesn't help, either.

>
> 4) Can you double check ADP_CS_5 bit 31?  I attached a patch to the kernel Bugzilla to
> add dyndbg output for it.  If it was for some reason set by Pre-OS CM in your BIOS/settings
> combination, we might need to undo it by the Linux CM.

All ports say "Hotplug disabled: 0".

dmesg attached to the bugzilla.
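For reference, the check behind the "Hotplug disabled" output comes down to testing bit 31 of the 32-bit ADP_CS_5 value. The sketch below is a standalone illustration, assuming the register dword has already been read; the macro and function names are hypothetical and are not taken from the debug patch or the thunderbolt driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 of ADP_CS_5, reported as "Hotplug disabled" in the debug output.
 * The macro name is made up for this sketch. */
#define ADP_CS_5_HOTPLUG_DISABLE (UINT32_C(1) << 31)

/* Return true if the hotplug-disable bit is set in a raw ADP_CS_5 value. */
bool adp_cs_5_hotplug_disabled(uint32_t adp_cs_5)
{
	return (adp_cs_5 & ADP_CS_5_HOTPLUG_DISABLE) != 0;
}

/* Return the value with the hotplug-disable bit cleared, which is what a
 * connection manager would write back to undo a Pre-OS CM setting. */
uint32_t adp_cs_5_clear_hotplug_disable(uint32_t adp_cs_5)
{
	return adp_cs_5 & ~ADP_CS_5_HOTPLUG_DISABLE;
}

/* Usage sketch on made-up register values. */
void adp_cs_5_example(void)
{
	assert(adp_cs_5_hotplug_disabled(UINT32_C(0x80000000)));
	assert(!adp_cs_5_hotplug_disabled(UINT32_C(0x00000000)));
}
```

A result of 0 on every port, as reported above, means the bit is not set and there is nothing for the Linux CM to undo.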

>
> 5) Are you changing any of the default runtime PM policies for any of the USB4 routers or
> root ports used for tunneling using software like TLP?

No. And they should be suspended by default.

Kai-Heng

>
> >
> > >
> > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216448
> > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@...onical.com>
> > > ---
> > >  drivers/thunderbolt/nhi.c    | 29 +++++++++++++++++++++++++++++
> > >  drivers/thunderbolt/switch.c |  6 ++++++
> > >  drivers/thunderbolt/tb.c     |  1 +
> > >  drivers/thunderbolt/tb.h     |  5 +++++
> > >  include/linux/thunderbolt.h  |  1 +
> > >  5 files changed, 42 insertions(+)
> > >
> > > diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
> > > index cb8c9c4ae93a2..75f5ce5e22978 100644
> > > --- a/drivers/thunderbolt/nhi.c
> > > +++ b/drivers/thunderbolt/nhi.c
> > > @@ -1225,6 +1225,8 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  {
> > >     struct tb_nhi *nhi;
> > >     struct tb *tb;
> > > +   struct pci_dev *p = NULL;
> > > +   struct tb_pci_bridge *pci_bridge, *n;
> > >     int res;
> > >
> > >     if (!nhi_imr_valid(pdev)) {
> > > @@ -1306,6 +1308,19 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >             nhi_shutdown(nhi);
> > >             return res;
> > >     }
> > > +
> > > +   if (pdev->vendor == PCI_VENDOR_ID_AMD) {
> > > +           while ((p = pci_get_device(PCI_VENDOR_ID_AMD, 0x14cd, p))) {
> > > +                   pci_bridge = kmalloc(sizeof(struct tb_pci_bridge), GFP_KERNEL);
> > > +                   if (!pci_bridge)
> > > +                           goto cleanup;
> > > +
> > > +                   pci_bridge->bridge = p;
> > > +                   INIT_LIST_HEAD(&pci_bridge->list);
> > > +                   list_add(&pci_bridge->list, &tb->bridge_list);
> > > +           }
> > > +   }
> >
> > You can't walk the device tree and create a "shadow" list of devices
> > like this and expect any lifetime rules to work properly with them at
> > all.
> >
> > Please do not do this.
> >
> > greg k-h
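One concrete aspect of the lifetime concern can be shown with a small userspace model. pci_get_device() returns the next match with its reference count incremented and drops the reference on the device passed in as the starting point, so a pointer saved inside such a loop no longer has a reference behind it once the loop moves on. The sketch below only models that counting; every name in it is hypothetical, and it is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device. */
struct dev {
	int refs;
};

struct dev *dev_get(struct dev *d)
{
	if (d)
		d->refs++;
	return d;
}

void dev_put(struct dev *d)
{
	if (d)
		d->refs--;
}

/* Models the iterator contract of pci_get_device(): the returned device
 * has a reference taken, and the reference on 'from' is dropped. */
struct dev *next_dev(struct dev *table, size_t n, struct dev *from)
{
	size_t i = from ? (size_t)(from - table) + 1 : 0;

	dev_put(from);
	return i < n ? dev_get(&table[i]) : NULL;
}

/* Saves the raw pointers, as the patch does: no reference backs them. */
size_t collect_unreferenced(struct dev *table, size_t n, struct dev **out)
{
	struct dev *p = NULL;
	size_t k = 0;

	while ((p = next_dev(table, n, p)))
		out[k++] = p;
	return k;
}

/* Takes an extra reference for every saved pointer. */
size_t collect_referenced(struct dev *table, size_t n, struct dev **out)
{
	struct dev *p = NULL;
	size_t k = 0;

	while ((p = next_dev(table, n, p)))
		out[k++] = dev_get(p);
	return k;
}
```

In the first variant every saved entry ends the loop with a reference count of zero in this model. The second variant keeps the objects alive, but a held reference alone still does not tell the list owner when a bridge is removed, so it does not settle the objection.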
