Message-ID: <aFGp8tXaL7NCORhk@mini-arch>
Date: Tue, 17 Jun 2025 10:46:26 -0700
From: Stanislav Fomichev <stfomichev@...il.com>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, bjorn@...nel.org, magnus.karlsson@...el.com,
maciej.fijalkowski@...el.com, jonathan.lemon@...il.com,
sdf@...ichev.me, ast@...nel.org, daniel@...earbox.net,
hawk@...nel.org, john.fastabend@...il.com, bpf@...r.kernel.org,
netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next 0/2] net: xsk: add two sysctl knobs
On 06/17, Jason Xing wrote:
> Hi Stanislav,
>
> On Tue, Jun 17, 2025 at 9:11 AM Stanislav Fomichev <stfomichev@...il.com> wrote:
> >
> > On 06/17, Jason Xing wrote:
> > > From: Jason Xing <kernelxing@...cent.com>
> > >
> > > Introduce a control method in the xsk path to let users have the chance
> > > to tune it manually.
> >
> > Can you expand more on why the defaults don't work for you?
>
> We use a user-level TCP stack with xsk to transmit packets that have
> higher priority than normal kernel TCP flows. It turns out that
> enlarging the number reduces how often the sendto syscall has to be
> triggered, which contributes to faster transmission. It's very easy
> to hit the upper bound (namely, 32) if you log the return value of
> sendto. I mentioned a bit about this in the second patch, noting that
> a similar knob already exists in the qdisc layer.
> Furthermore, exposing important parameters helps applications do
> AI/auto-tuning to judge which value is the best fit for their
> production workload. That is also a promising trend :)
>
> >
> > Also, can we put these settings into the socket instead of (global/ns)
> > sysctl?
>
> As for MAX_PER_SOCKET_BUDGET, it doesn't seem easy to get at the
> corresponding netns? I have no strong opinion on this point for now.
I'm suggesting something along these lines (see below). And then add
some way to configure it (plus, obviously, set the default value
on init). There is also a question on whether you need separate
values for MAX_PER_SOCKET_BUDGET and TX_BATCH_SIZE, and if yes,
then why.
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 72c000c0ae5f..fb2caec9914d 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -424,7 +424,7 @@ bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc)
 	rcu_read_lock();
 again:
 	list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
-		if (xs->tx_budget_spent >= MAX_PER_SOCKET_BUDGET) {
+		if (xs->tx_budget_spent >= xs->max_tx_budget) {
 			budget_exhausted = true;
 			continue;
 		}
@@ -779,7 +779,6 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 static int __xsk_generic_xmit(struct sock *sk)
 {
 	struct xdp_sock *xs = xdp_sk(sk);
-	u32 max_batch = TX_BATCH_SIZE;
 	bool sent_frame = false;
 	struct xdp_desc desc;
 	struct sk_buff *skb;
@@ -797,7 +796,7 @@ static int __xsk_generic_xmit(struct sock *sk)
 		goto out;
 
 	while (xskq_cons_peek_desc(xs->tx, &desc, xs->pool)) {
-		if (max_batch-- == 0) {
+		if (xs->max_tx_budget-- == 0) {
 			err = -EAGAIN;
 			goto out;
 		}
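
(To be clear, in a real patch the xmit path would probably work on a
local copy of the per-socket value so the configured setting is not
decremented away across calls; a rough sketch, assuming the
xs->max_tx_budget field from the diff above:

	/* Sketch only: consume a local copy of the per-socket budget
	 * instead of the stored xs->max_tx_budget itself.
	 */
	static int __xsk_generic_xmit(struct sock *sk)
	{
		struct xdp_sock *xs = xdp_sk(sk);
		u32 max_budget = READ_ONCE(xs->max_tx_budget);
		...
		while (xskq_cons_peek_desc(xs->tx, &desc, xs->pool)) {
			if (max_budget-- == 0) {
				err = -EAGAIN;
				goto out;
			}
			...
		}
		...
	}
)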
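
For the configuration side, a per-socket knob could follow the existing
xsk setsockopt pattern, with the default filled in at socket creation
time. Rough sketch only; XDP_MAX_TX_BUDGET is a made-up optname here,
not an existing one:

	/* Hypothetical setsockopt hook; XDP_MAX_TX_BUDGET is only meant
	 * to show where a per-socket override would go.
	 */
	static int xsk_setsockopt(struct socket *sock, int level, int optname,
				  sockptr_t optval, unsigned int optlen)
	{
		struct xdp_sock *xs = xdp_sk(sock->sk);
		...
		switch (optname) {
		case XDP_MAX_TX_BUDGET:
		{
			unsigned int budget;

			if (optlen < sizeof(budget))
				return -EINVAL;
			if (copy_from_sockptr(&budget, optval, sizeof(budget)))
				return -EFAULT;
			if (!budget)
				return -EINVAL;

			WRITE_ONCE(xs->max_tx_budget, budget);
			return 0;
		}
		...
		}
		...
	}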