lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aFGp8tXaL7NCORhk@mini-arch>
Date: Tue, 17 Jun 2025 10:46:26 -0700
From: Stanislav Fomichev <stfomichev@...il.com>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
	pabeni@...hat.com, bjorn@...nel.org, magnus.karlsson@...el.com,
	maciej.fijalkowski@...el.com, jonathan.lemon@...il.com,
	sdf@...ichev.me, ast@...nel.org, daniel@...earbox.net,
	hawk@...nel.org, john.fastabend@...il.com, bpf@...r.kernel.org,
	netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next 0/2] net: xsk: add two sysctl knobs

On 06/17, Jason Xing wrote:
> Hi Stanislav,
> 
> On Tue, Jun 17, 2025 at 9:11 AM Stanislav Fomichev <stfomichev@...il.com> wrote:
> >
> > On 06/17, Jason Xing wrote:
> > > From: Jason Xing <kernelxing@...cent.com>
> > >
> > > Introduce a control method in the xsk path to let users have the chance
> > > to tune it manually.
> >
> > Can you expand more on why the defaults don't work for you?
> 
> We use a user-level tcp stack with xsk to transmit packets that have
> higher priorities than other normal kernel tcp flows. It turns out
> that enlarging the number can minimize times of triggering sendto
> sysctl, which contributes to faster transmission. it's very easy to
> hit the upper bound (namely, 32) if you log the return value of
> sendto. I mentioned a bit about this in the second patch, saying that
> we can have a similar knob already appearing in the qdisc layer.
> Furthermore, exposing important parameters can help applications
> complete their AI/auto-tuning to judge which one is the best fit in
> their production workload. That is also one of the promising
> tendencies :)
> 
> >
> > Also, can we put these settings into the socket instead of (global/ns)
> > sysctl?
> 
> As to MAX_PER_SOCKET_BUDGET, it seems not easy to get its
> corresponding netns? I have no strong opinion on this point for now.

I'm suggesting something along these lines (see below). And then add
some way to configure it (plus, obviously, set the default value
on init). There is also a question on whether you need separate
values for MAX_PER_SOCKET_BUDGET and TX_BATCH_SIZE, and if yes,
then why.

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 72c000c0ae5f..fb2caec9914d 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -424,7 +424,7 @@ bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc)
 	rcu_read_lock();
 again:
 	list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
-		if (xs->tx_budget_spent >= MAX_PER_SOCKET_BUDGET) {
+		if (xs->tx_budget_spent >= xs->max_tx_budget) {
 			budget_exhausted = true;
 			continue;
 		}
@@ -779,7 +779,6 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 static int __xsk_generic_xmit(struct sock *sk)
 {
 	struct xdp_sock *xs = xdp_sk(sk);
-	u32 max_batch = TX_BATCH_SIZE;
 	bool sent_frame = false;
 	struct xdp_desc desc;
 	struct sk_buff *skb;
@@ -797,7 +796,7 @@ static int __xsk_generic_xmit(struct sock *sk)
 		goto out;
 
 	while (xskq_cons_peek_desc(xs->tx, &desc, xs->pool)) {
-		if (max_batch-- == 0) {
+		if (xs->max_tx_budget-- == 0) {
 			err = -EAGAIN;
 			goto out;
 		}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ