Message-ID: <CAAVpQUC+7GWuGxZa4=3k3XCNSuLddpZbhoeEmmpWe930jpycWA@mail.gmail.com>
Date: Tue, 14 Oct 2025 20:32:25 -0700
From: Kuniyuki Iwashima <kuniyu@...gle.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Neal Cardwell <ncardwell@...gle.com>, 
	Simon Horman <horms@...nel.org>, Willem de Bruijn <willemb@...gle.com>, netdev@...r.kernel.org, 
	eric.dumazet@...il.com
Subject: Re: [PATCH v1 net-next 2/4] net: control skb->ooo_okay from skb_set_owner_w()

On Mon, Oct 13, 2025 at 8:22 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> 15 years after Tom Herbert added skb->ooo_okay, only the TCP transport
> benefits from it.
>
> We can support other transports directly from skb_set_owner_w().
>
> If no other TX packet for this socket is in a host queue (qdisc, NIC queue),
> there is no risk of self-inflicted reordering, so we can set skb->ooo_okay.
>
> This allows netdev_pick_tx() to choose a TX queue based on XPS settings,
> instead of reusing the queue chosen at the time the first packet was sent
> for connected sockets.
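
For readers following along, the idea in the two paragraphs above can be
modelled in a few lines of userspace C. This is only a rough sketch under
stated assumptions, not the code from the patch: struct toy_sock,
struct toy_skb, toy_set_owner_w(), toy_wfree() and toy_pick_tx() are all
hypothetical names, and the "cpu % nr_txq" mapping merely stands in for
whatever XPS would select.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for struct sock / struct sk_buff; NOT the kernel types. */
struct toy_sock {
	unsigned int wmem_queued; /* bytes of this socket still in host queues */
	int cached_txq;           /* TX queue remembered earlier, -1 if none */
};

struct toy_skb {
	struct toy_sock *sk;
	unsigned int truesize;
	bool ooo_okay;
};

/*
 * Charge the packet to its socket. If nothing else from this socket is
 * still queued on the host, changing TX queue cannot reorder the flow,
 * so the packet is marked ooo_okay.
 */
static void toy_set_owner_w(struct toy_skb *skb, struct toy_sock *sk)
{
	skb->sk = sk;
	skb->ooo_okay = (sk->wmem_queued == 0);
	sk->wmem_queued += skb->truesize;
}

/* Called when the packet has left the host (TX completion). */
static void toy_wfree(struct toy_skb *skb)
{
	assert(skb->sk->wmem_queued >= skb->truesize);
	skb->sk->wmem_queued -= skb->truesize;
}

/*
 * Pick a TX queue: recompute from the current CPU when that is safe
 * (no cached queue yet, or ooo_okay is set), else reuse the cached one.
 */
static int toy_pick_tx(struct toy_skb *skb, int cpu, int nr_txq)
{
	struct toy_sock *sk = skb->sk;

	if (sk->cached_txq < 0 || skb->ooo_okay)
		sk->cached_txq = cpu % nr_txq; /* hypothetical XPS-like mapping */
	return sk->cached_txq;
}
```

In this model, a packet sent while an earlier one is still queued keeps the
cached queue, and a packet sent after the socket has drained follows the
sending thread to the queue of its new CPU.
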
>
> Tested:
>   500 concurrent UDP_RR connected UDP flows, host with 32 TX queues,
>   512 cpus, XPS setup.
>
>   super_netperf 500 -t UDP_RR -H <host> -l 1000 -- -r 100,100 -Nn &
>
> This patch saves between 10% and 20% of cycles, depending on how
> the process scheduler migrates threads among cpus.
>
> Using the following bpftrace script, we can see the effect of Qdisc/NIC
> TX queues being better used (fewer cache line misses).
>
> bpftrace -e '
> k:__dev_queue_xmit { @start[cpu] = nsecs; }
> kr:__dev_queue_xmit {
>  if (@start[cpu]) {
>     $delay = nsecs - @start[cpu];
>     delete(@start[cpu]);
>     @__dev_queue_xmit_ns = hist($delay);
>  }
> }
> END { clear(@start); }'
>
> Before:
> @__dev_queue_xmit_ns:
> [128, 256)             6 |                                                    |
> [256, 512)        116283 |                                                    |
> [512, 1K)        1888205 |@@@@@@@@@@@                                         |
> [1K, 2K)         8106167 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@    |
> [2K, 4K)         8699293 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [4K, 8K)         2600676 |@@@@@@@@@@@@@@@                                     |
> [8K, 16K)         721688 |@@@@                                                |
> [16K, 32K)        122995 |                                                    |
> [32K, 64K)         10639 |                                                    |
> [64K, 128K)          119 |                                                    |
> [128K, 256K)           1 |                                                    |
>
> After:
> @__dev_queue_xmit_ns:
> [128, 256)             3 |                                                    |
> [256, 512)        651112 |@@                                                  |
> [512, 1K)        8109938 |@@@@@@@@@@@@@@@@@@@@@@@@@@                          |
> [1K, 2K)        16081031 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [2K, 4K)         2411692 |@@@@@@@                                             |
> [4K, 8K)           98994 |                                                    |
> [8K, 16K)           1536 |                                                    |
> [16K, 32K)           587 |                                                    |
> [32K, 64K)             2 |                                                    |
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Reviewed-by: Neal Cardwell <ncardwell@...gle.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@...gle.com>
