lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 2 Nov 2023 10:47:48 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Mario Limonciello <mario.limonciello@....com>
Cc:     bhelgaas@...gle.com, mika.westerberg@...ux.intel.com,
        andreas.noever@...il.com, michael.jamet@...el.com,
        YehezkelShB@...il.com, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org,
        Alexander.Deucher@....com
Subject: Re: [PATCH 2/2] PCI: Ignore PCIe ports used for tunneling in
 pcie_bandwidth_available()

On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
> On 11/1/2023 17:52, Bjorn Helgaas wrote:
> > On Tue, Oct 31, 2023 at 08:34:38AM -0500, Mario Limonciello wrote:
> > > The USB4 spec specifies that PCIe ports that are used for tunneling
> > > PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s.
> > > 
> > > In reality these ports speed is controlled by the fabric implementation.
> > 
> > So I guess you're saying the speed advertised by PCI_EXP_LNKSTA is not
> > the actual speed?  And we don't have a generic way to find the actual
> > speed?
> 
> Correct.
> 
> > > Downstream drivers such as amdgpu which utilize pcie_bandwidth_available()
> > > to program the device will always find the PCIe ports used for
> > > tunneling as a limiting factor and may make incorrect decisions.
> > > 
> > > To prevent problems in downstream drivers check explicitly for ports
> > > being used for PCIe tunneling and skip them when looking for bandwidth
> > > limitations.
> > > 
> > > 2 types of devices are detected:
> > > 1) PCIe root port used for PCIe tunneling
> > > 2) Intel Thunderbolt 3 bridge
> > > 
> > > Downstream drivers could make this change on their own but then they
> > > wouldn't be able to detect other potential speed bottlenecks.
> > 
> > Is the implication that a tunneling port can *never* be a speed
> > bottleneck?  That seems to be how this patch would work in practice.
> 
> I think that's a stretch we should avoid concluding.

I'm just reading the description and the patch, which seem to say that
pcie_bandwidth_available() will never report a tunneling port as the
limiting port.

Maybe this can be rectified with a comment about how we can't tell the
actual bandwidth of a tunneled port, and it may be a hidden unreported
bottleneck, so pcie_bandwidth_available() can't actually return a
reliable result.  Seems sort of unsatisfactory, but ... I dunno, maybe
it's the best we can do.

> IIUC the fabric can be hosting other traffic and it's entirely possible the
> traffic over the tunneling port runs more slowly at times.
> 
> Perhaps that's why the the USB4 spec decided to advertise it this way? I
> don't know.

Maybe, although the same happens on shared PCIe links above switches.

> > > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
> > 
> > This issue says the external GPU doesn't work at all.  Does this patch
> > fix that?  This patch looks like it might improve GPU performance, but
> > wouldn't fix something that didn't work at all.
> 
> The issue actually identified 4 distinct different problems.  The 3 problems
> will be fixed in amdgpu which are functional.
> 
> This performance one was from later in the ticket after some back and forth
> identifying proper solutions for the first 3.

There's a lot of material in that report.  Is there a way to link to
the specific part related to performance?

> > > + * This function excludes root ports and bridges used for USB4 and TBT3 tunneling.

Wrap to fit in 80 columns like the rest of the file.

> > >    */
> > >   u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
> > >   			     enum pci_bus_speed *speed,
> > > @@ -6254,6 +6290,10 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
> > >   	bw = 0;
> > >   	while (dev) {
> > > +		/* skip root ports and bridges used for tunneling */
> > > +		if (pcie_is_tunneling_port(dev))
> > > +			goto skip;
> > > +
> > >   		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> > >   		next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
> > > @@ -6274,6 +6314,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
> > >   				*width = next_width;
> > >   		}
> > > +skip:
> > >   		dev = pci_upstream_bridge(dev);
> > >   	}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ