Message-Id: <20250716122725.6088-1-kerneljasonxing@gmail.com>
Date: Wed, 16 Jul 2025 20:27:25 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: davem@...emloft.net,
edumazet@...gle.com,
kuba@...nel.org,
pabeni@...hat.com,
bjorn@...nel.org,
magnus.karlsson@...el.com,
maciej.fijalkowski@...el.com,
jonathan.lemon@...il.com,
sdf@...ichev.me,
ast@...nel.org,
daniel@...earbox.net,
hawk@...nel.org,
john.fastabend@...il.com,
joe@...a.to,
willemdebruijn.kernel@...il.com
Cc: bpf@...r.kernel.org,
netdev@...r.kernel.org,
Jason Xing <kernelxing@...cent.com>
Subject: [PATCH net-next v2] xsk: skip validating skb list in xmit path
From: Jason Xing <kernelxing@...cent.com>
This patch does only one thing: it removes the validate_xmit_skb_list()
call from the xsk transmit path.

In copy mode, xsk has no need to validate and check the skb in
validate_xmit_skb_list(), because xsk_build_skb() does not set up, and
does not need to set up, the state that this validation checks. Xsk is
only responsible for delivering raw data from userspace to the driver.

__dev_direct_xmit() was factored out of af_packet in commit 865b03f21162
("dev: packet: make packet_direct_xmit a common function"). The call to
validate_xmit_skb_list() was added in commit 104ba78c9880 ("packet: on
direct_xmit, limit tso and csum to supported devices") to support TSO.
Since xsk_build_skb() does not support TSO/VLAN offloads, the
validate_xmit_skb_list() call can be removed for xsk. Skipping these
checks speeds up transmission, which matters in such a hot path.
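
The resulting split can be modelled in userspace as below. This is only a
rough sketch: every mock_* name is a hypothetical stand-in rather than a
kernel definition, and it assumes that af_packet reaches the device through
dev_direct_xmit() while xsk calls __dev_direct_xmit() directly.

```c
#include <stddef.h>

/*
 * Userspace model only: every name below is a hypothetical stand-in,
 * not a kernel definition.
 */
struct mock_skb {
	int len;	/* payload length, unused by the model itself */
};

static int validate_calls;	/* counts how often validation ran */

/* Stand-in for validate_xmit_skb_list(): hands back the skb if acceptable. */
static struct mock_skb *mock_validate(struct mock_skb *skb)
{
	validate_calls++;
	return skb;
}

/* Stand-in for __dev_direct_xmit() after the patch: no validation here. */
static int mock_core_direct_xmit(struct mock_skb *skb)
{
	return skb ? 0 : -1;
}

/* Stand-in for dev_direct_xmit(): validate first, then transmit. */
static int mock_dev_direct_xmit(struct mock_skb *skb)
{
	if (mock_validate(skb) != skb)
		return 1;	/* models NET_XMIT_DROP */
	return mock_core_direct_xmit(skb);
}

/* Stand-in for the xsk copy-mode path: raw frame goes straight to the core. */
static int mock_xsk_xmit(struct mock_skb *skb)
{
	return mock_core_direct_xmit(skb);
}
```

In this model, a frame sent with mock_xsk_xmit() leaves validate_calls
untouched, while one sent with mock_dev_direct_xmit() increments it once.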
Performance-wise, I ran './xdpsock -i enp2s0f0np0 -t -S -s 64' on a
machine with the ixgbe driver to verify this. Throughput consistently
goes up by 5.48%, as shown below:
Before:
 sock0@...2s0f0np0:0 txonly xdp-skb
                   pps            pkts           1.00
rx                 0              0
tx                 1,187,410      3,513,536
After:
 sock0@...2s0f0np0:0 txonly xdp-skb
                   pps            pkts           1.00
rx                 0              0
tx                 1,252,590      2,459,456
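
The 5.48% figure follows from the two tx pps readings above. A minimal
sketch of the arithmetic, where gain_percent() is a hypothetical helper
name and not part of the patch:

```c
/* Hypothetical helper (not part of the patch): relative change in percent. */
static double gain_percent(double before_pps, double after_pps)
{
	return (after_pps - before_pps) / before_pps * 100.0;
}
```

Calling gain_percent(1187410, 1252590) yields a value just under 5.49,
which matches the 5.48% quoted above once truncated to two decimals.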
This patch also removes about 4% of overhead in total, which can be
observed with perf:
|--2.97%--validate_xmit_skb
|          |
|           --1.76%--netif_skb_features
|                     |
|                      --0.65%--skb_network_protocol
|
|--1.06%--validate_xmit_xfrm
Signed-off-by: Jason Xing <kernelxing@...cent.com>
---
V2
Link: https://lore.kernel.org/all/20250713025756.24601-1-kerneljasonxing@gmail.com/
1. avoid adding a new flag
2. add more descriptions from Stan
---
 include/linux/netdevice.h | 30 ++++++++++++++++++++----------
 net/core/dev.c            |  6 ------
 2 files changed, 20 insertions(+), 16 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a80d21a14612..8e05c99928e1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3364,16 +3364,6 @@ static inline int dev_queue_xmit_accel(struct sk_buff *skb,
 	return __dev_queue_xmit(skb, sb_dev);
 }
 
-static inline int dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
-{
-	int ret;
-
-	ret = __dev_direct_xmit(skb, queue_id);
-	if (!dev_xmit_complete(ret))
-		kfree_skb(skb);
-	return ret;
-}
-
 int register_netdevice(struct net_device *dev);
 void unregister_netdevice_queue(struct net_device *dev, struct list_head *head);
 void unregister_netdevice_many(struct list_head *head);
@@ -4301,6 +4291,26 @@ static __always_inline int ____dev_forward_skb(struct net_device *dev,
 	return 0;
 }
 
+static inline int dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
+{
+	struct net_device *dev = skb->dev;
+	struct sk_buff *orig_skb = skb;
+	bool again = false;
+	int ret;
+
+	skb = validate_xmit_skb_list(skb, dev, &again);
+	if (skb != orig_skb) {
+		dev_core_stats_tx_dropped_inc(dev);
+		kfree_skb_list(skb);
+		return NET_XMIT_DROP;
+	}
+
+	ret = __dev_direct_xmit(skb, queue_id);
+	if (!dev_xmit_complete(ret))
+		kfree_skb(skb);
+	return ret;
+}
+
 bool dev_nit_active_rcu(const struct net_device *dev);
 static inline bool dev_nit_active(const struct net_device *dev)
 {
diff --git a/net/core/dev.c b/net/core/dev.c
index e365b099484e..793f5d45c6b2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4744,19 +4744,13 @@ EXPORT_SYMBOL(__dev_queue_xmit);
 int __dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
 {
 	struct net_device *dev = skb->dev;
-	struct sk_buff *orig_skb = skb;
 	struct netdev_queue *txq;
 	int ret = NETDEV_TX_BUSY;
-	bool again = false;
 
 	if (unlikely(!netif_running(dev) ||
 		     !netif_carrier_ok(dev)))
 		goto drop;
 
-	skb = validate_xmit_skb_list(skb, dev, &again);
-	if (skb != orig_skb)
-		goto drop;
-
 	skb_set_queue_mapping(skb, queue_id);
 	txq = skb_get_tx_queue(dev, skb);
 
--
2.41.3