[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAL+tcoD6gkyirk=CQ0yepk3fkZVf+28Zc7nUoModeCT3Po7VYQ@mail.gmail.com>
Date: Sat, 21 Jun 2025 09:06:03 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: Stanislav Fomichev <stfomichev@...il.com>, Jakub Kicinski <kuba@...nel.org>, davem@...emloft.net,
edumazet@...gle.com, pabeni@...hat.com, bjorn@...nel.org,
magnus.karlsson@...el.com, maciej.fijalkowski@...el.com,
jonathan.lemon@...il.com, sdf@...ichev.me, ast@...nel.org,
daniel@...earbox.net, hawk@...nel.org, john.fastabend@...il.com, joe@...a.to,
bpf@...r.kernel.org, netdev@...r.kernel.org,
Jason Xing <kernelxing@...cent.com>, magnus.karlsson@...il.com, skhawaja@...gle.com
Subject: Re: [PATCH net-next v3] net: xsk: introduce XDP_MAX_TX_BUDGET set/getsockopt
On Sat, Jun 21, 2025 at 6:20 AM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> Stanislav Fomichev wrote:
> > On 06/19, Jakub Kicinski wrote:
> > > On Thu, 19 Jun 2025 17:04:40 +0800 Jason Xing wrote:
> > > > @@ -424,7 +421,9 @@ bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc)
> > > > rcu_read_lock();
> > > > again:
> > > > list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
> > > > - if (xs->tx_budget_spent >= MAX_PER_SOCKET_BUDGET) {
> > > > + int max_budget = READ_ONCE(xs->max_tx_budget);
> > > > +
> > > > + if (xs->tx_budget_spent >= max_budget) {
> > > > budget_exhausted = true;
> > > > continue;
> > > > }
> > > > @@ -779,7 +778,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
> > > > static int __xsk_generic_xmit(struct sock *sk)
> > > > {
> > > > struct xdp_sock *xs = xdp_sk(sk);
> > > > - u32 max_batch = TX_BATCH_SIZE;
> > > > + u32 max_budget = READ_ONCE(xs->max_tx_budget);
> > >
> > > Hm, maybe a question to Stan / Willem & other XSK experts but are these
> > > two max values / code paths really related? Question 2 -- is generic
> > > XSK a legit optimization target, legit enough to add uAPI?
> >
> > 1) xsk_tx_peek_desc is for zc case and xsk_build_skb is copy mode;
> > whether we want to affect zc case given the fact that Jason seemingly
> > cares about copy mode is a good question.
>
> The two constants seem to be only tangentially created.
>
> If there is fear that one tunable modifies both, it is simple enough
> to remove the unnecessary dependency and only tune the first.
Last night I was thinking only let the first one take effect and keep
the zc mode untouched. IIUC, if we publish the uAPI (setsockopt),
we're _not_ allowed to remove the unwanted one when we spot some
problems? If so, I will only make the copy mode work :) If the latter
needs this in the future, another setsockopt can be easily added. Am I
right?
>
> > 2) I do find it surprising as well. Recent busy polling patches were
> > also using/targeting copy mode. But from my pow, if people use it, I see
> > no reason to make it more usable.
>
> That's a very fair question.
>
> Jason, have you tried XDP_ZEROCOPY? It's quite plausible that that
> would address your issue.
Not yet, but I will try and verify maybe at the beginning of next
month as you suggested.
>
> I have had a use for copy mode, but that was rather obscure. A small
> packet workload where copy cost is negligible, and with copy mode it
> was easy to make to reinstate flow steering in XDP to specific XSKs,
> akin to
>
> https://lore.kernel.org/netdev/65c0f032ac71a_7396029419@willemb.c.googlers.com.notmuch/
Interesting. Thanks for the pointer :)
>
> The main issue with that remained that driver copy mode also implies
> the slower generic copy based Tx path, which goes through the full
> dev_queue_xmit stack.
To be more specific, not the full dev_queue_xmit stack as
__xsk_generic_xmit() bypasses the qdisc layer.
Thanks,
Jason
Powered by blists - more mailing lists