lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 7 Feb 2024 12:40:57 +0200
From: Mathias Nyman <mathias.nyman@...ux.intel.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
Cc: "Christian A. Ehrhardt" <lk@...e.de>, niklas.neronin@...ux.intel.com,
 Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
 Greg KH <gregkh@...uxfoundation.org>, linux-usb@...r.kernel.org
Subject: Re: This is the fourth time I’ve tried to find what led to the regression of outgoing network speed and each time I find the merge commit 8c94ccc7cd691472461448f98e2372c75849406c

On 6.2.2024 18.12, Mikhail Gavrilov wrote:
> On Tue, Feb 6, 2024 at 4:24 PM Mathias Nyman
> <mathias.nyman@...ux.intel.com> wrote:
> 
> I confirm after reverting all listed commits and 57e153dfd0e7
> performance of the network returned to theoretical maximum.
> 
>> That patch changes how we request MSI/MSI-X interrupt(s) for xhci.
>>
>> Is there any change is /proc/interrupts between a good and bad case?
>> Such as xhci_hcd using MSI-X instead of MSI, or eth0 and xhci_hcd
>> interrupting on the same CPU?
> 
> On the good kernel I have - 32 xhci_hcd, and bad only - 4.
> In both scenarios using PCI-MSIX.
> I attached both interrupt output as archives to this message.
> 


Thanks,

Looks like your network adapter ends up interrupting CPU0 in the bad case due
to the change in how many interrupts are requested by xhci_hcd before it.

bad case:
	CPU0	CPU1	...	CPU31
87:	18213809 0	... 	0	IR-PCI-MSIX-0000:0e:00.0    0-edge      enp14s0

Does manually changing it to some other CPU help? picking one that doesn't already
handle a lot of interrupts. CPU0 could also in general be more busy, possibly spending
more time with interrupts disabled.

For example change to CPU23 in the bad case:

echo 800000 > /proc/irq/87/smp_affinity

Check from proc/interrupts that enp14s0 interrupts actually go to CPU23 after this.

Thanks
Mathias


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ