Message-ID: <db8e457fcd155436449b035e8791a8241b0df400.camel@kernel.org>
Date: Fri, 06 Dec 2024 19:12:37 +0100
From: Niklas Schnelle <niks@...nel.org>
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>, 
	linux-pci@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>, Lorenzo
 Pieralisi <lorenzo.pieralisi@....com>, Rob Herring <robh@...nel.org>,
 Krzysztof Wilczyński	 <kw@...ux.com>, "Maciej W . Rozycki"
 <macro@...am.me.uk>, Jonathan Cameron	 <Jonathan.Cameron@...wei.com>, Lukas
 Wunner <lukas@...ner.de>, Alexandru Gagniuc <mr.nuke.me@...il.com>, Krishna
 chaitanya chundru <quic_krichai@...cinc.com>, Srinivas Pandruvada
 <srinivas.pandruvada@...ux.intel.com>, "Rafael J . Wysocki"
 <rafael@...nel.org>, 	linux-pm@...r.kernel.org, Smita Koralahalli	
 <Smita.KoralahalliChannabasappa@....com>, linux-kernel@...r.kernel.org
Cc: Daniel Lezcano <daniel.lezcano@...aro.org>, Amit Kucheria
 <amitk@...nel.org>,  Zhang Rui <rui.zhang@...el.com>, Christophe JAILLET
 <christophe.jaillet@...adoo.fr>, 	linux-pci@...r.kernel.org
Subject: Re: [PATCH v9 6/9] PCI/bwctrl: Re-add BW notification portdrv as
 PCIe BW controller

On Fri, 2024-10-18 at 17:47 +0300, Ilpo Järvinen wrote:
> This mostly reverts the commit b4c7d2076b4e ("PCI/LINK: Remove
> bandwidth notification"). An upcoming commit extends this driver
> building PCIe bandwidth controller on top of it.
> 
> The PCIe bandwidth notification were first added in the commit
> e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth
> notification") but later had to be removed. The significant changes
> compared with the old bandwidth notification driver include:
> 
> 1) Don't print the notifications into kernel log, just keep the Link
>    Speed cached in struct pci_bus updated. While somewhat unfortunate,
>    the log spam was the source of complaints that eventually lead to
>    the removal of the bandwidth notifications driver (see the links
>    below for further information).
> 
> 2) Besides the Link Bandwidth Management Interrupt, enable also Link
>    Autonomous Bandwidth Interrupt to cover the other source of
>    bandwidth changes.
> 
> 3) Use threaded IRQ with IRQF_ONESHOT to handle Bandwidth Notification
>    Interrupts to address the problem fixed in the commit 3e82a7f9031f
>    ("PCI/LINK: Supply IRQ handler so level-triggered IRQs are acked")).
> 
> 4) Handle Link Speed updates robustly. Refresh the cached Link Speed
>    when enabling Bandwidth Notification Interrupts, and solve the race
>    between Link Speed read and LBMS/LABS update in
>    pcie_bwnotif_irq_thread().
> 
> 5) Use concurrency safe LNKCTL RMW operations.
> 
> 6) The driver is now called PCIe bwctrl (bandwidth controller) instead
>    of just bandwidth notifications because of increased scope and
>    functionality within the driver.
> 
> 7) Coexist with the Target Link Speed quirk in
>    pcie_failed_link_retrain(). Provide LBMS counting API for it.
> 
> 8) Tweaks to variable/functions names for consistency and length
>    reasons.
> 
> Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus
> to keep track PCIe Link Speed changes.
> 
> Link: https://lore.kernel.org/all/20190429185611.121751-1-helgaas@kernel.org/
> Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@intel.com/
> Link: https://lore.kernel.org/linux-pci/20200115221008.GA191037@google.com/
> Suggested-by: Lukas Wunner <lukas@...ner.de> # Building bwctrl on top of bwnotif
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> ---

Hi Ilpo,

I bisected a v6.13-rc1 boot hang on my personal workstation to this
patch. Sadly I don't have many details, such as a panic message, because
the boot hangs before any kernel messages appear, or at least they are
not visible long enough to read. I haven't looked into the code yet, as
I wanted to raise awareness first. Since the commit doesn't revert
cleanly on v6.13-rc1, I haven't tried reverting it yet either.

Here are some details on my system:
- AMD Ryzen 9 3900X 
- ASRock X570 Creator Motherboard
- Radeon RX 5600 XT
- Intel JHL7540 Thunderbolt 3 USB Controller (only USB 2 plugged)
- Intel 82599 10 Gigabit NIC with SR-IOV enabled with 2 VFs
- Intel I211 Gigabit NIC
- Intel Wi-Fi 6 AX200
- Aquantia AQtion AQC107 NIC

If you have patches or other things to try, just ask.

Thanks,
Niklas

