Message-ID: <c5f6ab402de93f0b675d19499490e8c99701b5cc.camel@mellanox.com>
Date: Fri, 31 May 2019 21:56:51 +0000
From: Saeed Mahameed <saeedm@...lanox.com>
To: "daniel@...earbox.net" <daniel@...earbox.net>,
"magnus.karlsson@...el.com" <magnus.karlsson@...el.com>,
"ast@...nel.org" <ast@...nel.org>,
"bjorn.topel@...el.com" <bjorn.topel@...el.com>,
Maxim Mikityanskiy <maximmi@...lanox.com>
CC: "davem@...emloft.net" <davem@...emloft.net>,
"yhs@...com" <yhs@...com>,
"songliubraving@...com" <songliubraving@...com>,
Tariq Toukan <tariqt@...lanox.com>,
"kafai@...com" <kafai@...com>,
"jakub.kicinski@...ronome.com" <jakub.kicinski@...ronome.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"maciejromanfijalkowski@...il.com" <maciejromanfijalkowski@...il.com>,
"bsd@...com" <bsd@...com>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next v3 00/16] AF_XDP infrastructure improvements and
mlx5e support
On Fri, 2019-05-24 at 12:18 +0200, Björn Töpel wrote:
> On 2019-05-24 11:35, Maxim Mikityanskiy wrote:
> > This series contains improvements to the AF_XDP kernel
> > infrastructure
> > and AF_XDP support in mlx5e. The infrastructure improvements are
> > required for mlx5e, but also some of them benefit to all drivers,
> > and
> > some can be useful for other drivers that want to implement AF_XDP.
> >
> >
[...]
>
> Maxim, this doesn't address the uapi concern we had on your v2.
> Please refer to Magnus' comment here [1].
>
> Please educate me why you cannot publish AF_XDP without the uapi
> change?
> It's an extension, right? If so, then existing XDP/AF_XDP program can
> use Mellanox ZC without your addition? It's great that Mellanox has a
> ZC
> capable driver, but the uapi change is a NAK.
>
> To reiterate; We'd like to get the queue setup/steering for AF_XDP
> correct. I, and Magnus, dislike this approach. It requires a more
> complicated XDP program, and is hard for regular users to understand.
>
>
Hi Bjorn and Magnus,
It is not clear to me why you dislike this approach. If anything, it
addresses many of the concerns you raised about the current, limited
approach of re-using ("stealing") only the regular RX rings for xsk
traffic. For instance:
1) An xsk ring now has a unique id (wasn't this the plan from the
beginning?); see the sketch right after this list for how an XDP
program keys the xsks map by that queue id.
2) No RSS issues: only explicit steering rules direct traffic to the
newly created, isolated xsk ring. Default RSS is not affected and the
regular RX rings stay intact.
3) The new scheme is flexible, allows as many xsk sockets as needed,
and can co-exist with the regular rings.
4) We want a solution that can replace DPDK. Being limited to a fixed
number of RX rings, and stealing from the regular rings, is not a
worthy design just because some drivers do not want to deal with, or
do not know how to deal with, creating dedicated resources.
5) I think it is wrong to compare xsk rings with regular rings. An xsk
ring is really just a device context that redirects traffic to a
special buffer space; there is no real memory buffer model behind it
other than the rx/tx descriptors (the memory model is handled outside
the driver).
6) mlx5 is designed and optimized for exactly such use cases
(dedicated/unique rx/tx rings for XDP). Holding us to the current
AF_XDP limitations without letting us improve the AF_XDP design is
really not fair.
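To make 1) and 2) concrete, here is a minimal sketch of the XDP program
side. Assumptions on my part: a libbpf-style BTF map definition, a map
sized for 64 queues, and a kernel that allows xskmap lookups from BPF
programs (v5.3+); the explicit steering rule that points the flow at
the dedicated queue id (e.g. an ethtool ntuple rule) is configured
separately and is not shown:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_XSKMAP);
        __uint(max_entries, 64);
        __uint(key_size, sizeof(__u32));
        __uint(value_size, sizeof(__u32));
} xsks_map SEC(".maps");

SEC("xdp")
int xsk_redirect(struct xdp_md *ctx)
{
        __u32 qid = ctx->rx_queue_index;

        /* A set entry means an AF_XDP socket is bound to this queue
         * id: redirect the frame to that socket. Frames on all other
         * queues, including the regular RX rings, go up the normal
         * stack untouched.
         */
        if (bpf_map_lookup_elem(&xsks_map, &qid))
                return bpf_redirect_map(&xsks_map, qid, 0);

        return XDP_PASS;
}

char _license[] SEC("license") = "GPL";

Because the dedicated xsk ring has its own queue id, the program stays
this small; it never has to filter out traffic that belongs to the
regular rings.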
The way I see it, this new extension is actually a generalization that
allows more drivers to be supported and gives AF_XDP more flexibility.
If you have different ideas on how to implement the new design, please
provide your feedback and we will be more than happy to improve the
current implementation; but asking us to drop it is, I think, not a
fair request.
Side note: our task is to provide a scalable and flexible in-kernel XDP
solution so we can offer a valid replacement for DPDK and other
userspace-only solutions. I think we need a scheme that allows an
unlimited number of xsk sockets/rings with full flow
separation/isolation between different user sockets/apps. The driver/hw
resources are really very cheap (there is no buffer management), much
cheaper than allocating full-blown regular socket buffers.
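For completeness, a rough sketch of the userspace side using libbpf's
xsk helpers (<bpf/xsk.h>; newer setups get these from libxdp), binding
one socket to a hypothetical dedicated queue id 16 on "eth0". The
names, sizes and queue id are illustrative only, and error handling is
trimmed to the bare minimum:

#include <stdlib.h>
#include <unistd.h>
#include <bpf/xsk.h>

#define NUM_FRAMES 4096

int main(void)
{
        size_t size = NUM_FRAMES * XSK_UMEM__DEFAULT_FRAME_SIZE;
        struct xsk_ring_prod fill, tx;
        struct xsk_ring_cons comp, rx;
        struct xsk_umem *umem;
        struct xsk_socket *xsk;
        void *bufs;

        /* UMEM: the packet buffer area shared with the kernel. The
         * memory model lives here in userspace, not in the driver.
         */
        if (posix_memalign(&bufs, getpagesize(), size))
                return 1;
        if (xsk_umem__create(&umem, bufs, size, &fill, &comp, NULL))
                return 1;

        /* One socket per (ifname, queue id) pair; with a dedicated
         * xsk queue id there is no contention with the regular RX
         * rings.
         */
        if (xsk_socket__create(&xsk, "eth0", 16, umem, &rx, &tx, NULL))
                return 1;

        /* ... populate the fill ring and poll rx here ... */

        xsk_socket__delete(xsk);
        xsk_umem__delete(umem);
        free(bufs);
        return 0;
}

Each additional socket is just another queue id plus its descriptor
rings, which is why the per-socket driver/hw cost stays so low.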
Thanks,
Saeed.
> Thanks,
> Björn
>
> [1]
> https://lore.kernel.org/bpf/CAJ8uoz2UHk+5xPwz-STM9gkQZdm7r_=jsgaB0nF+mHgch=axPg@mail.gmail.com/
>
>