Message-ID: <Z4AJD97LFmjfCrc2@LQ3V64L9R2>
Date: Thu, 9 Jan 2025 09:36:15 -0800
From: Joe Damato <jdamato@...tly.com>
To: Magnus Karlsson <magnus.karlsson@...il.com>
Cc: Stanislav Fomichev <sdf@...ichev.me>, netdev@...r.kernel.org,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, linux-kernel@...r.kernel.org,
bpf@...r.kernel.org, horms@...nel.org, ast@...nel.org,
daniel@...earbox.net, hawk@...nel.org, john.fastabend@...il.com,
bjorn@...nel.org, magnus.karlsson@...el.com,
maciej.fijalkowski@...el.com, jonathan.lemon@...il.com,
mkarsten@...terloo.ca
Subject: Re: [PATCH net] xsk: Bring back busy polling support

On Thu, Jan 09, 2025 at 04:22:16PM +0100, Magnus Karlsson wrote:
> On Thu, 9 Jan 2025 at 01:35, Stanislav Fomichev <sdf@...ichev.me> wrote:
> >
> > Commit 86e25f40aa1e ("net: napi: Add napi_config") moved napi->napi_id
> > assignment to a later point in time (napi_hash_add_with_id). This breaks
> > __xdp_rxq_info_reg which copies napi_id at an earlier time and now
> > stores 0 napi_id. It also makes sk_mark_napi_id_once_xdp and
> > __sk_mark_napi_id_once useless because they now work against 0 napi_id.
> > Since sk_busy_loop requires valid napi_id to busy-poll on, there is no way
> > to busy-poll AF_XDP sockets anymore.
> >
> > Bring back the ability to busy-poll on XSK by resolving socket's napi_id
> > at bind time. This relies on relatively recent netif_queue_set_napi,
> > but (assume) at this point most popular drivers should have been converted.
> > This also removes per-tx/rx cycles which used to check and/or set
> > the napi_id value.
> >
> > Confirmed by running a busy-polling AF_XDP socket
> > (github.com/fomichev/xskrtt) on mlx5 and looking at BusyPollRxPackets
> > from /proc/net/netstat.
>
> Thanks Stanislav for finding and fixing this. As a bonus, the
> resulting code is much nicer too.
>
> I just took a look at the Intel drivers and some of our drivers have
> not been converted to use netif_queue_set_napi() yet. Just ice, e1000,
> and e1000e use it. But that is on us to fix.

igc also supports it ;)

I tried to add support to i40e some time ago, but ran into some
issues and didn't hear back, so I gave up on i40e.

In case my previous attempt is helpful for anyone at Intel, see [1].

[1]: https://lore.kernel.org/lkml/20240410043936.206169-1-jdamato@fastly.com/
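
For anyone who wants to repeat the check from the quoted commit message
(watching BusyPollRxPackets in /proc/net/netstat) on a newly converted
driver, here is a minimal sketch of a counter reader. It is not part of the
patch. The function names are made up for illustration, and it assumes the
counter is listed under the TcpExt prefix and that the file keeps its usual
layout of a header line of names followed by a line of values.

```python
def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {prefix: {counter: value}}.

    Assumes the file is a sequence of line pairs: a header line with the
    counter names, then a line with the values, both starting with the
    same "Prefix:" token.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    stats = {}
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        value_prefix, numbers = values.split(":", 1)
        if prefix != value_prefix:
            raise ValueError(f"mismatched line pair: {prefix!r} vs {value_prefix!r}")
        stats[prefix] = dict(zip(names.split(), (int(n) for n in numbers.split())))
    return stats


def busy_poll_rx_packets(text):
    """Return the BusyPollRxPackets counter (assumed to live under TcpExt)."""
    return parse_netstat(text)["TcpExt"]["BusyPollRxPackets"]


def read_busy_poll_rx_packets(path="/proc/net/netstat"):
    """Hypothetical convenience wrapper: read the counter from a live system."""
    with open(path) as f:
        return busy_poll_rx_packets(f.read())
```

Calling read_busy_poll_rx_packets() before and after a test run and
comparing the two values should show whether busy polling picked up any
packets. A counter that stays flat suggests the socket never got a valid
napi_id.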