Message-ID: <928df647-5b20-406b-8da5-3199f5cfbb48@amd.com>
Date:   Wed, 1 Nov 2023 20:14:31 -0500
From:   Mario Limonciello <mario.limonciello@....com>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     bhelgaas@...gle.com, mika.westerberg@...ux.intel.com,
        andreas.noever@...il.com, michael.jamet@...el.com,
        YehezkelShB@...il.com, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org,
        Alexander.Deucher@....com
Subject: Re: [PATCH 2/2] PCI: Ignore PCIe ports used for tunneling in
 pcie_bandwidth_available()

On 11/1/2023 17:52, Bjorn Helgaas wrote:
> On Tue, Oct 31, 2023 at 08:34:38AM -0500, Mario Limonciello wrote:
>> The USB4 spec specifies that PCIe ports that are used for tunneling
>> PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s.
>>
>> In reality these ports' speed is controlled by the fabric implementation.
> 
> So I guess you're saying the speed advertised by PCI_EXP_LNKSTA is not
> the actual speed?  And we don't have a generic way to find the actual
> speed?

Correct.
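
To illustrate why the hardcoded value matters, here is a minimal userspace sketch of how a Link Status word decodes into a bandwidth figure. It is not the kernel code: the helper names are hypothetical, the masks mirror PCI_EXP_LNKSTA_CLS and PCI_EXP_LNKSTA_NLW, and the per-lane figures are approximations after encoding overhead.

```c
#include <stdint.h>

/* Masks mirroring PCI_EXP_LNKSTA_CLS / PCI_EXP_LNKSTA_NLW. */
#define LNKSTA_CLS        0x000f  /* Current Link Speed code */
#define LNKSTA_NLW        0x03f0  /* Negotiated Link Width */
#define LNKSTA_NLW_SHIFT  4

/* Hypothetical helper: approximate per-lane Mb/s for a speed code. */
static uint32_t lane_mbs_from_code(uint32_t code)
{
	switch (code) {
	case 1: return 2000;   /* 2.5 GT/s, 8b/10b encoding */
	case 2: return 4000;   /* 5 GT/s, 8b/10b encoding */
	case 3: return 7876;   /* 8 GT/s, 128b/130b encoding */
	default: return 0;     /* other codes not modelled in this sketch */
	}
}

/* Hypothetical helper: extract the negotiated lane count. */
static uint32_t lnksta_width(uint16_t lnksta)
{
	return (lnksta & LNKSTA_NLW) >> LNKSTA_NLW_SHIFT;
}

/* Hypothetical helper: bandwidth implied by the advertised link status. */
static uint32_t lnksta_bandwidth_mbs(uint16_t lnksta)
{
	return lane_mbs_from_code(lnksta & LNKSTA_CLS) * lnksta_width(lnksta);
}
```

A tunneling port that advertises 2.5GT/s x4 (0x0041) decodes to 8000 Mb/s here, while an 8GT/s x4 endpoint (0x0043) decodes to 31504 Mb/s, so the port looks like the bottleneck regardless of what the fabric actually delivers.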

> 
>> Downstream drivers such as amdgpu which utilize pcie_bandwidth_available()
>> to program the device will always find the PCIe ports used for
>> tunneling as a limiting factor and may make incorrect decisions.
>>
>> To prevent problems in downstream drivers check explicitly for ports
>> being used for PCIe tunneling and skip them when looking for bandwidth
>> limitations.
>>
>> 2 types of devices are detected:
>> 1) PCIe root port used for PCIe tunneling
>> 2) Intel Thunderbolt 3 bridge
>>
>> Downstream drivers could make this change on their own but then they
>> wouldn't be able to detect other potential speed bottlenecks.
> 
> Is the implication that a tunneling port can *never* be a speed
> bottleneck?  That seems to be how this patch would work in practice.

I think that's a conclusion we should avoid drawing.

IIUC the fabric can be hosting other traffic and it's entirely possible 
the traffic over the tunneling port runs more slowly at times.

Perhaps that's why the USB4 spec decided to advertise it this way?
I don't know.

> 
>> Link: https://lore.kernel.org/linux-pci/7ad4b2ce-4ee4-429d-b5db-3dfc360f4c3e@amd.com/
>> Link: https://www.usb.org/document-library/usb4r-specification-v20
>>        USB4 V2 with Errata and ECN through June 2023 - CLEAN p710
> 
> I guess this is sec 11.2.1 ("PCIe Physical Layer Logical Sub-block")
> on PDF p710 (labeled "666" on the printed page).  How annoying that
> the PDF page numbers don't match the printed ones; do the section
> numbers at least stay stable in new spec revisions?

I'd hope so.  I'll change it to section numbers in the next revision.

> 
>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
> 
> This issue says the external GPU doesn't work at all.  Does this patch
> fix that?  This patch looks like it might improve GPU performance, but
> wouldn't fix something that didn't work at all.

The issue actually identified 4 distinct problems.  The first 3 are
functional problems and will be fixed in amdgpu.

This performance problem came up later in the ticket, after some back
and forth identifying proper solutions for the first 3.

> 
>> Signed-off-by: Mario Limonciello <mario.limonciello@....com>
>> ---
>>   drivers/pci/pci.c | 41 +++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 41 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 59c01d68c6d5..4a7dc9c2b8f4 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -6223,6 +6223,40 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
>>   }
>>   EXPORT_SYMBOL(pcie_set_mps);
>>   
>> +/**
>> + * pcie_is_tunneling_port - Check if a PCI device is used for TBT3/USB4 tunneling
>> + * @dev: PCI device to check
>> + *
>> + * Returns true if the device is used for PCIe tunneling, false otherwise.
>> + */
>> +static bool
>> +pcie_is_tunneling_port(struct pci_dev *pdev)
> 
> Use usual function signature styling (all on one line).

OK.

> 
>> +{
>> +	struct device_link *link;
>> +	struct pci_dev *supplier;
>> +
>> +	/* Intel TBT3 bridge */
>> +	if (pdev->is_thunderbolt)
>> +		return true;
>> +
>> +	if (!pci_is_pcie(pdev))
>> +		return false;
>> +
>> +	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT)
>> +		return false;
>> +
>> +	/* PCIe root port used for tunneling linked to USB4 router */
>> +	list_for_each_entry(link, &pdev->dev.links.suppliers, c_node) {
>> +		supplier = to_pci_dev(link->supplier);
>> +		if (!supplier)
>> +			continue;
>> +		if (supplier->class == PCI_CLASS_SERIAL_USB_USB4)
>> +			return true;
> 
> Since this is in drivers/pci, and this USB4/Thunderbolt routing is not
> covered by the PCIe specs, this is basically black magic.  Is there a
> reference to the USB4 spec we could include to help make it less
> magical?

The "magic" part is that there is an ACPI construct to indicate a PCIe 
port is linked to a USB4 router.

Here is a link to the page where this is explained:
https://learn.microsoft.com/en-us/windows-hardware/design/component-guidelines/usb4-acpi-requirements#port-mapping-_dsd-for-usb-3x-and-pcie

On the Linux side this link is created in the 'thunderbolt' driver.

Thinking about this again, this does actually mean we could have a 
different result based on whether pcie_bandwidth_available() is called 
before or after the 'thunderbolt' driver has loaded.

For example if a GPU driver that called pcie_bandwidth_available() was 
in the initramfs but 'thunderbolt' was in the rootfs we might end up 
with the wrong result again.

Considering this, I think it's a good idea to move the creation of the
device link into drivers/pci/pci-acpi.c and store a bit in struct
pci_dev to indicate it's a tunneled port.

Then 'thunderbolt' can look for this directly instead of walking all the 
FW nodes.

pcie_bandwidth_available() can just look at the tunneled port bit 
instead of the existence of the device link.
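
A rough userspace model of that idea, assuming a hypothetical is_tunneled bit that is set once at enumeration time. The struct and function names below are stand-ins for illustration, not the kernel's struct pci_dev or the eventual patch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev; not kernel code. */
struct model_dev {
	struct model_dev *upstream;  /* next device toward the root, or NULL */
	uint32_t lane_mbs;           /* advertised per-lane bandwidth in Mb/s */
	uint32_t width;              /* advertised lane count */
	bool is_tunneled;            /* hypothetical bit set during enumeration */
};

/*
 * Walk from dev up to the root and return the smallest advertised
 * bandwidth, skipping ports flagged as tunneled. Because the flag is a
 * property of the device itself, the result does not depend on whether
 * another driver has created a device link yet.
 */
static uint32_t model_bandwidth_available(struct model_dev *dev,
					  struct model_dev **limiting_dev)
{
	uint32_t bw = 0;

	if (limiting_dev)
		*limiting_dev = NULL;

	for (; dev; dev = dev->upstream) {
		uint32_t next_bw;

		/* Advertised speed of a tunneled port is not meaningful. */
		if (dev->is_tunneled)
			continue;

		next_bw = dev->lane_mbs * dev->width;
		if (!bw || next_bw <= bw) {
			bw = next_bw;
			if (limiting_dev)
				*limiting_dev = dev;
		}
	}

	return bw;
}
```

With a GPU at roughly 7876 Mb/s per lane x4 below a tunneled root port advertising 2000 Mb/s per lane x4, this model reports the GPU's own link as the limit; with the bit cleared it reports the root port instead.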

> 
> Lukas' brief intro in
> https://lore.kernel.org/all/20230925141930.GA21033@wunner.de/ really
> helped me connect a few dots, because things like
> Documentation/admin-guide/thunderbolt.rst assume we already know those
> details.

Thanks for sharing that.  If I move the detection mechanism as I 
suggested above I'll reference some of that as well in the commit 
message to explain what exactly a tunneled port is.

> 
>> +	}
>> +
>> +	return false;
>> +}
>> +
>>   /**
>>    * pcie_bandwidth_available - determine minimum link settings of a PCIe
>>    *			      device and its bandwidth limitation
>> @@ -6236,6 +6270,8 @@ EXPORT_SYMBOL(pcie_set_mps);
>>    * limiting_dev, speed, and width pointers are supplied) information about
>>    * that point.  The bandwidth returned is in Mb/s, i.e., megabits/second of
>>    * raw bandwidth.
>> + *
>> + * This function excludes root ports and bridges used for USB4 and TBT3 tunneling.
>>    */
>>   u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>>   			     enum pci_bus_speed *speed,
>> @@ -6254,6 +6290,10 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>>   	bw = 0;
>>   
>>   	while (dev) {
>> +		/* skip root ports and bridges used for tunneling */
>> +		if (pcie_is_tunneling_port(dev))
>> +			goto skip;
>> +
>>   		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
>>   
>>   		next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
>> @@ -6274,6 +6314,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>>   				*width = next_width;
>>   		}
>>   
>> +skip:
>>   		dev = pci_upstream_bridge(dev);
>>   	}
>>   
>> -- 
>> 2.34.1
>>
