Message-Id: <20250716122725.6088-1-kerneljasonxing@gmail.com>
Date: Wed, 16 Jul 2025 20:27:25 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: davem@...emloft.net,
	edumazet@...gle.com,
	kuba@...nel.org,
	pabeni@...hat.com,
	bjorn@...nel.org,
	magnus.karlsson@...el.com,
	maciej.fijalkowski@...el.com,
	jonathan.lemon@...il.com,
	sdf@...ichev.me,
	ast@...nel.org,
	daniel@...earbox.net,
	hawk@...nel.org,
	john.fastabend@...il.com,
	joe@...a.to,
	willemdebruijn.kernel@...il.com
Cc: bpf@...r.kernel.org,
	netdev@...r.kernel.org,
	Jason Xing <kernelxing@...cent.com>
Subject: [PATCH net-next v2] xsk: skip validating skb list in xmit path

From: Jason Xing <kernelxing@...cent.com>

This patch does one thing: it removes the validate_xmit_skb_list()
call from the xsk transmit path.

For xsk in copy mode, there is no need to validate the skb in
validate_xmit_skb_list(), because xsk_build_skb() neither prepares the
prerequisites that the validation checks for nor needs to. Xsk is only
responsible for delivering raw data from userspace to the driver.
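
For reference, here is a hedged paraphrase, NOT the exact kernel code,
of roughly the per-skb work done by validate_xmit_skb() in
net/core/dev.c (the real function also handles xfrm, socket offloads
and the error paths). A copy-mode xsk skb built by xsk_build_skb()
exercises none of these branches:

#include <linux/err.h>
#include <linux/if_vlan.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <net/gso.h>	/* skb_gso_segment() in recent kernels */

static struct sk_buff *validate_xmit_skb_sketch(struct sk_buff *skb)
{
	netdev_features_t features = netif_skb_features(skb);

	/* Push the VLAN tag in software if the device cannot offload it. */
	if (skb_vlan_tag_present(skb) &&
	    !(features & NETIF_F_HW_VLAN_CTAG_TX))
		skb = __vlan_hwaccel_push_inside(skb);

	/*
	 * Segment in software if the device cannot take this GSO skb
	 * (simplified: the real code checks IS_ERR() and keeps the
	 * original skb when no segmentation was needed).
	 */
	if (skb && netif_needs_gso(skb, features))
		skb = skb_gso_segment(skb, features);

	/* Complete CHECKSUM_PARTIAL if the device cannot offload it. */
	if (skb && !IS_ERR(skb) && skb->ip_summed == CHECKSUM_PARTIAL &&
	    !(features & NETIF_F_CSUM_MASK) && skb_checksum_help(skb))
		skb = NULL;

	return skb;
}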

__dev_direct_xmit() was split out of af_packet in commit 865b03f21162
("dev: packet: make packet_direct_xmit a common function"), and a call
to validate_xmit_skb_list() was added in commit 104ba78c9880 ("packet:
on direct_xmit, limit tso and csum to supported devices") to support
TSO. Since xsk_build_skb() supports neither TSO nor VLAN offloads, the
validation can be dropped for xsk. This patch therefore moves the
validate_xmit_skb_list() call out of __dev_direct_xmit() and into the
dev_direct_xmit() wrapper, so af_packet keeps the validation while
xsk, which calls __dev_direct_xmit() directly, skips it. Skipping
these numerous checks measurably speeds up transmission in this
extremely hot path.
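
To illustrate the split, a sketch of the two call chains (as found in
net/packet/af_packet.c and net/xdp/xsk.c, bodies elided):

  packet_direct_xmit()                  /* af_packet */
    -> dev_direct_xmit()                /* inline wrapper: still runs
                                           validate_xmit_skb_list() */
      -> __dev_direct_xmit()

  xsk_generic_xmit()                    /* xsk copy mode */
    -> xsk_build_skb()
    -> __dev_direct_xmit()              /* validation now skipped */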

Performance-wise, I ran './xdpsock -i enp2s0f0np0 -t -S -s 64' to
verify the assumption, measuring on a machine with the ixgbe driver.
Throughput stably goes up by 5.48%, as shown below:
Before:
 sock0@...2s0f0np0:0 txonly xdp-skb
                   pps            pkts           1.00
rx                 0              0
tx                 1,187,410      3,513,536
After:
 sock0@...2s0f0np0:0 txonly xdp-skb
                   pps            pkts           1.00
rx                 0              0
tx                 1,252,590      2,459,456
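
(The pps column is the relevant metric; the cumulative pkts counters
presumably differ only because the two runs sampled over different
lengths of time.)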

This patch also removes a total of ~4% of CPU consumption, as observed
with perf:
|--2.97%--validate_xmit_skb
|          |
|           --1.76%--netif_skb_features
|                     |
|                      --0.65%--skb_network_protocol
|
|--1.06%--validate_xmit_xfrm

Signed-off-by: Jason Xing <kernelxing@...cent.com>
---
V2
Link: https://lore.kernel.org/all/20250713025756.24601-1-kerneljasonxing@gmail.com/
1. Avoid adding a new flag.
2. Add more description, as suggested by Stan.
---
 include/linux/netdevice.h | 30 ++++++++++++++++++++----------
 net/core/dev.c            |  6 ------
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a80d21a14612..8e05c99928e1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3364,16 +3364,6 @@ static inline int dev_queue_xmit_accel(struct sk_buff *skb,
 	return __dev_queue_xmit(skb, sb_dev);
 }
 
-static inline int dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
-{
-	int ret;
-
-	ret = __dev_direct_xmit(skb, queue_id);
-	if (!dev_xmit_complete(ret))
-		kfree_skb(skb);
-	return ret;
-}
-
 int register_netdevice(struct net_device *dev);
 void unregister_netdevice_queue(struct net_device *dev, struct list_head *head);
 void unregister_netdevice_many(struct list_head *head);
@@ -4301,6 +4291,26 @@ static __always_inline int ____dev_forward_skb(struct net_device *dev,
 	return 0;
 }
 
+static inline int dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
+{
+	struct net_device *dev = skb->dev;
+	struct sk_buff *orig_skb = skb;
+	bool again = false;
+	int ret;
+
+	skb = validate_xmit_skb_list(skb, dev, &again);
+	if (skb != orig_skb) {
+		dev_core_stats_tx_dropped_inc(dev);
+		kfree_skb_list(skb);
+		return NET_XMIT_DROP;
+	}
+
+	ret = __dev_direct_xmit(skb, queue_id);
+	if (!dev_xmit_complete(ret))
+		kfree_skb(skb);
+	return ret;
+}
+
 bool dev_nit_active_rcu(const struct net_device *dev);
 static inline bool dev_nit_active(const struct net_device *dev)
 {
diff --git a/net/core/dev.c b/net/core/dev.c
index e365b099484e..793f5d45c6b2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4744,19 +4744,13 @@ EXPORT_SYMBOL(__dev_queue_xmit);
 int __dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
 {
 	struct net_device *dev = skb->dev;
-	struct sk_buff *orig_skb = skb;
 	struct netdev_queue *txq;
 	int ret = NETDEV_TX_BUSY;
-	bool again = false;
 
 	if (unlikely(!netif_running(dev) ||
 		     !netif_carrier_ok(dev)))
 		goto drop;
 
-	skb = validate_xmit_skb_list(skb, dev, &again);
-	if (skb != orig_skb)
-		goto drop;
-
 	skb_set_queue_mapping(skb, queue_id);
 	txq = skb_get_tx_queue(dev, skb);
 
-- 
2.41.3

