Message-ID: <CANn89iKmuoXJtw4WZ0MRZE3WE-a-VtfTiWamSzXX0dx8pUcRqg@mail.gmail.com>
Date: Sat, 17 Jan 2026 19:16:57 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: kuniyu@...gle.com, ncardwell@...gle.com, netdev@...r.kernel.org, 
	davem@...emloft.net, pabeni@...hat.com, andrew+netdev@...n.ch, 
	horms@...nel.org
Subject: Re: [PATCH net-next] tcp: try to defer / return acked skbs to
 originating CPU

On Sat, Jan 17, 2026 at 5:43 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> Running a memcache-like workload under production(ish) load
> on a 300 thread AMD machine we see ~3% of CPU time spent
> in kmem_cache_free() via tcp_ack(), freeing skbs from the rtx queue.
> This workload pins workers away from the softirq CPUs, so
> the Tx skbs are pretty much always allocated on a different
> CPU than the one where the ACKs arrive. Try to use the defer skb free
> queue to return the skbs back to where they came from.
> This results in a ~4% performance improvement for the workload.
>

This probably makes sense when RFS is not used.
Here, RFS gives us a ~40% performance improvement for typical RPC workloads,
so I never took a look at this side :)

Have you tested what happens for bulk sends?
sendmsg() allocates skbs and pushes them to the transmit queue,
but ACK processing can decide to split TSO packets, and the new allocation
is then done on the softirq CPU (assuming RFS is not used).

Perhaps tso_fragment()/tcp_fragment() could copy the source
skb->alloc_cpu to the new buff->alloc_cpu.
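
A minimal sketch of that idea (not part of the posted patch; skb->alloc_cpu
is the existing field, but the helper name and its exact call site inside
tcp_fragment()/tso_fragment() are only illustrative):

#include <linux/skbuff.h>

/* Propagate the allocating CPU recorded at skb allocation time to the
 * new half produced by a TSO split, so that the deferred free in
 * tcp_wmem_free_skb() still returns the memory to the CPU that
 * allocated the original payload, not to the softirq CPU that
 * performed the split.
 */
static inline void tcp_skb_copy_alloc_cpu(struct sk_buff *buff,
					  const struct sk_buff *skb)
{
	buff->alloc_cpu = skb->alloc_cpu;
}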

Also, if the workers are pinned away from the softirq CPUs, their CPUs
will only process the defer queue in large batches, after receiving a
trigger_rx_softirq() IPI.
Any idea of the skb_defer_free_flush() latency when dealing with batches
of ~64 big TSO packets?
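
For context, a rough sketch of the flush those CPUs end up running (a
heavily simplified paraphrase of skb_defer_free_flush(); the exact list
handling, locking and wakeup threshold differ across kernel versions):

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* skb_attempt_defer_free() chains each skb onto the per-CPU defer list
 * of skb->alloc_cpu and, past a threshold, sends an IPI so that CPU
 * raises NET_RX_SOFTIRQ via trigger_rx_softirq(). net_rx_action() on
 * that CPU then drains the whole list in one go:
 */
static void defer_free_flush_sketch(struct softnet_data *sd)
{
	struct sk_buff *skb, *next;

	spin_lock(&sd->defer_lock);
	skb = sd->defer_list;
	sd->defer_list = NULL;
	sd->defer_count = 0;
	spin_unlock(&sd->defer_lock);

	while (skb) {
		next = skb->next;
		/* Batched free on the allocating CPU; for a big TSO skb
		 * this also releases all of its frag pages.
		 */
		napi_consume_skb(skb, 1);
		skb = next;
	}
}

Those per-skb frag releases, done ~64 at a time, are what the latency
question above is about.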

> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
>  include/net/tcp.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index ef0fee58fde8..e290651da508 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -332,7 +332,7 @@ static inline void tcp_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
>                 sk_mem_uncharge(sk, skb->truesize);
>         else
>                 sk_mem_uncharge(sk, SKB_TRUESIZE(skb_end_offset(skb)));
> -       __kfree_skb(skb);
> +       skb_attempt_defer_free(skb);
>  }
>
>  void sk_forced_mem_schedule(struct sock *sk, int size);
> --
> 2.52.0
>
