Message-ID: <willemdebruijn.kernel.199f9af074377@gmail.com>
Date: Fri, 28 Nov 2025 22:08:57 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jon Kohler <jon@...anix.com>, 
 netdev@...r.kernel.org, 
 Alexei Starovoitov <ast@...nel.org>, 
 Daniel Borkmann <daniel@...earbox.net>, 
 "David S. Miller" <davem@...emloft.net>, 
 Jakub Kicinski <kuba@...nel.org>, 
 Jesper Dangaard Brouer <hawk@...nel.org>, 
 John Fastabend <john.fastabend@...il.com>, 
 Stanislav Fomichev <sdf@...ichev.me>, 
 "(open list:XDP \\(eXpress Data Path\\):Keyword:\\(?:\\b|_\\)xdp\\(?:\\b|_\\))" <bpf@...r.kernel.org>
Cc: Jon Kohler <jon@...anix.com>
Subject: Re: [PATCH net-next v2 0/9] tun: optimize SKB allocation with NAPI
 cache

Jon Kohler wrote:
> Use the per-CPU NAPI cache for SKB allocation in most places, and
> leverage bulk allocation for tun_xdp_one since the batch size is known
> at submission time. Additionally, utilize napi_build_skb and
> napi_consume_skb to further benefit from the NAPI cache. This all
> improves efficiency by reducing allocation overhead. 
> 
> Note: This series does not address the large payload path in
> tun_alloc_skb, which spans sock.c and skbuff.c. A separate series will
> handle privatizing the allocation code in tun and integrating the NAPI
> cache for that path.
> 
> Results using basic iperf3 UDP test:
> TX guest: taskset -c 2 iperf3 -c rx-ip-here -t 30 -p 5200 -b 0 -u -i 30
> RX guest: taskset -c 2 iperf3 -s -p 5200 -D
> 
>         Bitrate       
> Before: 6.08 Gbits/sec
> After : 6.36 Gbits/sec
> 
> However, the basic test doesn't tell the whole story. Comparing
> flamegraphs from before and after, fewer cycles are spent on the RX
> vhost thread in the guest-to-guest case on a single host, and also
> fewer cycles in the guest-to-guest case across separate hosts, as the
> host NIC handlers benefit from these NAPI-allocated SKBs (and deferred
> free) as well.
> 
> Speaking of deferred free, v2 adds exporting deferred free from net
> core and using it immediately beforehand in tun_put_user. This not only
> keeps the cache as warm as possible, but also prevents a TX-heavy vhost
> thread from getting IPI'd like it's going out of style. This approach
> is similar in concept to what the NAPI loop does in net_rx_action.
> 
> I've also merged this series with a small series about cleaning up
> packet drop statistics along the various error paths in tun, as I want
> to make sure those all go through kfree_skb_reason(), and we'd have
> merge conflicts separating the two. If the maintainers want to take
> them separately, happy to break them apart if needed. It is fairly
> clean keeping them together otherwise.

I think it would be preferable to send the cleanup separately, first.

Why would that cause merge conflicts?
