Message-ID: <CAL+tcoDdntkJ8SFaqjPvkJoCDwiitqsCNeFUq7CYa_fajPQL4A@mail.gmail.com>
Date: Thu, 27 Nov 2025 20:49:45 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, 
	bjorn@...nel.org, magnus.karlsson@...el.com, maciej.fijalkowski@...el.com, 
	jonathan.lemon@...il.com, sdf@...ichev.me, ast@...nel.org, 
	daniel@...earbox.net, hawk@...nel.org, john.fastabend@...il.com, 
	bpf@...r.kernel.org, netdev@...r.kernel.org, 
	Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next v3] xsk: skip validating skb list in xmit path

On Thu, Nov 27, 2025 at 8:02 PM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On 11/25/25 12:57 PM, Jason Xing wrote:
> > This patch also removes total ~4% consumption which can be observed
> > by perf:
> > |--2.97%--validate_xmit_skb
> > |          |
> > |           --1.76%--netif_skb_features
> > |                     |
> > |                      --0.65%--skb_network_protocol
> > |
> > |--1.06%--validate_xmit_xfrm
> >
> > The above result has been verified on different NICs, like I40E. I
> > managed to see the number is going up by 4%.
>
> I must admit this delta is surprising, and does not fit my experience in
> slightly different scenarios with the plain UDP TX path.

My take is that when the path is extremely hot, even simple arithmetic
can add measurable overhead. Note that the pps is now over 2,000,000.
I say this because I've run a few similar tests that confirmed it.

>
> > [1] - analysis of the validate_xmit_skb()
> > 1. validate_xmit_unreadable_skb()
> >    xsk doesn't initialize skb->unreadable, so the function will not free
> >    the skb.
> > 2. validate_xmit_vlan()
> >    xsk also doesn't initialize skb->vlan_all.
> > 3. sk_validate_xmit_skb()
> >    skb from xsk_build_skb() doesn't have either sk_validate_xmit_skb or
> >    sk_state, so the skb will not be validated.
> > 4. netif_needs_gso()
> >    af_xdp doesn't support gso/tso.
> > 5. skb_needs_linearize() && __skb_linearize()
> >    skb doesn't have frag_list as always, so skb_has_frag_list() returns
> >    false. In copy mode, skb can put more data in the frags[] that can be
> >    found in xsk_build_skb_zerocopy().
>
> I'm not sure I parse this last sentence correctly, could you please
> re-phrase?
>
> I read it as the xsk xmit path could build skb with nr_frags > 0.
> That in turn will need validation from
> validate_xmit_skb()/skb_needs_linearize() depending on the egress device
> (lack of NETIF_F_SG), regardless of any other offload required.

There are two paths where frags get filled in:
1) xsk_build_skb() -> xsk_build_skb_zerocopy() -> skb_fill_page_desc()
-> shinfo->frags[i]
2) xsk_build_skb() -> skb_add_rx_frag() -> ... -> shinfo->frags[i]

Neither of them touches skb->frag_list, which means frag_list is NULL.
IIUC, there is no place in the xsk xmit path where frag_list is used
(I actually tested this). We can see that skb_needs_linearize() checks
skb_has_frag_list() first, so it will not proceed after seeing it
return false.
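
For reference, here is a small userspace model of the check being
discussed. It is only a sketch, not kernel code: model_skb, MODEL_F_SG
and MODEL_F_FRAGLIST are made-up stand-ins, and the condition is written
the way I recall skb_needs_linearize(), so it should be checked against
the tree. With frag_list always NULL the frag_list term is false; the
remaining term depends on nr_frags and on whether the device advertises
SG, which is the case raised in the quoted question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel feature bits; values are arbitrary. */
#define MODEL_F_SG       (1u << 0)
#define MODEL_F_FRAGLIST (1u << 1)

/* Minimal model of the skb fields the check looks at (assumed layout). */
struct model_skb {
    unsigned int data_len;  /* bytes held outside the linear area */
    unsigned int nr_frags;  /* entries used in shinfo->frags[] */
    void *frag_list;        /* left NULL by both xsk build paths */
};

/* Models the linearize decision: nonlinear AND (frag_list term OR frags term). */
static bool model_needs_linearize(const struct model_skb *skb,
                                  uint32_t features)
{
    bool nonlinear = skb->data_len != 0;
    bool fraglist_term = skb->frag_list != NULL &&
                         !(features & MODEL_F_FRAGLIST);
    bool frags_term = skb->nr_frags != 0 && !(features & MODEL_F_SG);

    return nonlinear && (fraglist_term || frags_term);
}
```

In this model an skb with nr_frags > 0 and frag_list == NULL is left
alone when MODEL_F_SG is set, and is flagged for linearization when it
is not.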

Does that make sense to you?

Thanks,
Jason
