Date:   Wed, 25 Jan 2023 10:14:23 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     Nick Child <nnac123@...ux.ibm.com>
Cc:     netdev@...r.kernel.org, bjking1@...ux.ibm.com, haren@...ux.ibm.com,
        ricklind@...ibm.com
Subject: Re: [PATCH net-next] ibmvnic: Toggle between queue types in
 affinity mapping

On Wed, 25 Jan 2023 10:55:20 -0600 Nick Child wrote:
> On 1/24/23 20:39, Jakub Kicinski wrote:
> > On Mon, 23 Jan 2023 16:17:27 -0600 Nick Child wrote:  
> >> A more optimal algorithm would balance the number of RX and TX IRQs across
> >> the physical cores. Therefore, to increase performance, distribute RX and
> >> TX IRQs across cores by alternating between assigning IRQs for RX and TX
> >> queues to CPUs.
> >> With a system with 64 CPUs and 32 queues, this results in the following
> >> pattern (binding is done in reverse order for readable code):
> >>
> >> IRQ type |  CPU number
> >> -----------------------
> >> TX15	 |	0-1
> >> RX15	 |	2-3
> >> TX14	 |	4-5
> >> RX14	 |	6-7  
> > 
> > Seems sensible but why did you invert the order? To save LoC?  
> 
> Thanks for checking this out Jakub.
> 
> Correct, the effect on performance is the same and IMO the algorithm
> is more readable. Less so about minimizing lines and more about
> making the code understandable for the next dev.

I spend way too much time explaining IRQ pinning to developers at my
"day job" :( Stuff like threaded NAPI means that more and more people
interact with it. So I think having a more easily understandable mapping
is worth the extra complexity in the driver. By which I mean:

Tx0 -> 0-1
Rx0 -> 2-3
Tx1 -> 4-5

IOW, Qn -> CPUs (n*4 + is_rx*2) through (n*4 + is_rx*2 + 1)
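
To make the arithmetic concrete, here is a minimal Python sketch of that mapping. It is illustrative only and not the driver code: the function name irq_cpu_range and the stride parameter are hypothetical. A stride of 2 CPUs per IRQ matches the 64-CPU, 32-queue example quoted above (64 IRQs in total), and each queue index owns one block of 2*stride CPUs, Tx first and Rx second.

```python
def irq_cpu_range(n, is_rx, stride=2):
    """Return the (first, last) CPU for queue n's Tx or Rx IRQ.

    Hypothetical helper: 'stride' (CPUs per IRQ) is an assumption taken
    from the 64-CPU / 32-queue example; with stride=2 this reduces to
    n*4 + is_rx*2 through n*4 + is_rx*2 + 1.
    """
    # Each queue index owns a block of 2*stride CPUs: Tx half, then Rx half.
    first = n * 2 * stride + int(is_rx) * stride
    return first, first + stride - 1


if __name__ == "__main__":
    # Print the first few rows of the ascending mapping.
    for n in range(2):
        for is_rx in (False, True):
            lo, hi = irq_cpu_range(n, is_rx)
            print(f"{'Rx' if is_rx else 'Tx'}{n} -> {lo}-{hi}")
```

With the default stride, irq_cpu_range(0, False) gives (0, 1), irq_cpu_range(0, True) gives (2, 3), and irq_cpu_range(1, False) gives (4, 5), matching the three rows listed above.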
