[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y3557Ecr80Y9ZD2z@google.com>
Date: Wed, 23 Nov 2022 11:52:12 -0800
From: sdf@...gle.com
To: Jakub Kicinski <kuba@...nel.org>
Cc: "Toke Høiland-Jørgensen" <toke@...hat.com>,
bpf@...r.kernel.org, ast@...nel.org, daniel@...earbox.net,
andrii@...nel.org, martin.lau@...ux.dev, song@...nel.org,
yhs@...com, john.fastabend@...il.com, kpsingh@...nel.org,
haoluo@...gle.com, jolsa@...nel.org,
Tariq Toukan <tariqt@...dia.com>,
David Ahern <dsahern@...il.com>,
Willem de Bruijn <willemb@...gle.com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Anatoly Burakov <anatoly.burakov@...el.com>,
Alexander Lobakin <alexandr.lobakin@...el.com>,
Magnus Karlsson <magnus.karlsson@...il.com>,
Maryam Tahhan <mtahhan@...hat.com>, xdp-hints@...-project.net,
netdev@...r.kernel.org
Subject: Re: [xdp-hints] [PATCH bpf-next v2 6/8] mlx4: Introduce mlx4_xdp_buff
wrapper for xdp_buff
On 11/23, Jakub Kicinski wrote:
> On Wed, 23 Nov 2022 10:26:41 -0800 Stanislav Fomichev wrote:
> > > This embedding trick works for drivers that put xdp_buff on the stack,
> > > but mlx5 supports XSK zerocopy, which uses the xsk_buff_pool for
> > > allocating them. This makes it a bit awkward to do the same thing
> there;
> > > and since it's probably going to be fairly common to do something like
> > > this, how about we just add a 'void *drv_priv' pointer to struct
> > > xdp_buff that the drivers can use? The xdp_buff already takes up a
> full
> > > cache line anyway, so any data stuffed after it will spill over to a
> new
> > > one; so I don't think there's much difference performance-wise.
> >
> > I guess the alternative is to extend xsk_buff_pool with some new
> > argument for xdp_buff tailroom? (so it can kmalloc(sizeof(xdp_buff) +
> > xdp_buff_tailroom))
> > But it seems messy because there is no way of knowing what the target
> > device's tailroom is, so it has to be a user setting :-/
> > I've started with a priv pointer in xdp_buff initially, it seems fine
> > to go back. I'll probably convert veth/mlx4 to the same mode as well
> > to avoid having different approaches in different places..
> Can we not do this please? Add 16B of "private driver space" after
> the xdp_buff in xdp_buff_xsk (we have 16B to full cacheline), the
> drivers decide how they use it. Drivers can do BUILD_BUG_ON() for their
> expected size and cast that to whatever struct they want. This is how
> various offloads work, the variable size tailroom would be an over
> design IMO.
> And this way non XSK paths can keep its normal typing.
Good idea, prototyped below, lmk if it that's not what you had in mind.
struct xdp_buff_xsk {
struct xdp_buff xdp; /* 0 56 */
u8 cb[16]; /* 56 16 */
/* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
dma_addr_t dma; /* 72 8 */
dma_addr_t frame_dma; /* 80 8 */
struct xsk_buff_pool * pool; /* 88 8 */
u64 orig_addr; /* 96 8 */
struct list_head free_list_node; /* 104 16 */
/* size: 120, cachelines: 2, members: 7 */
/* last cacheline: 56 bytes */
};
Toke, I can try to merge this into your patch + keep your SoB (or feel free
to try this and retest yourself, whatever works).
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
index bc2d9034af5b..837bf103b871 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
@@ -44,6 +44,11 @@
(MLX5E_XDP_INLINE_WQE_MAX_DS_CNT * MLX5_SEND_WQE_DS - \
sizeof(struct mlx5_wqe_inline_seg))
+struct mlx5_xdp_cb {
+ struct mlx5_cqe64 *cqe;
+ struct mlx5e_rq *rq;
+};
+
struct mlx5e_xsk_param;
int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param
*xsk);
bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
index c91b54d9ff27..84d23b2da7ce 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
@@ -5,6 +5,7 @@
#include "en/xdp.h"
#include <net/xdp_sock_drv.h>
#include <linux/filter.h>
+#include <linux/build_bug.h>
/* RX data path */
@@ -286,8 +287,14 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct
mlx5e_rq *rq,
u32 cqe_bcnt)
{
struct xdp_buff *xdp = wi->au->xsk;
+ struct mlx5_xdp_cb *cb;
struct bpf_prog *prog;
+ BUILD_BUG_ON(sizeof(struct mlx5_xdp_cb) > XSKB_CB_SIZE);
+ cb = xp_get_cb(xdp);
+ cb->cqe = NULL /*cqe*/;
+ cb->rq = rq;
+
/* wi->offset is not used in this function, because xdp->data and the
* DMA address point directly to the necessary place. Furthermore, the
* XSK allocator allocates frames per packet, instead of pages, so
diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index f787c3f524b0..b298590429e7 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -19,8 +19,11 @@ struct xdp_sock;
struct device;
struct page;
+#define XSKB_CB_SIZE 16
+
struct xdp_buff_xsk {
struct xdp_buff xdp;
+ u8 cb[XSKB_CB_SIZE]; /* Private area for the drivers to use. */
dma_addr_t dma;
dma_addr_t frame_dma;
struct xsk_buff_pool *pool;
@@ -143,6 +146,11 @@ static inline dma_addr_t xp_get_frame_dma(struct
xdp_buff_xsk *xskb)
return xskb->frame_dma;
}
+static inline void *xp_get_cb(struct xdp_buff *xdp)
+{
+ return (void *)xdp + offsetof(struct xdp_buff_xsk, cb);
+}
+
void xp_dma_sync_for_cpu_slow(struct xdp_buff_xsk *xskb);
static inline void xp_dma_sync_for_cpu(struct xdp_buff_xsk *xskb)
{
> > > I'll send my patch to add support to mlx5 (using the drv_priv pointer
> > > approach) separately.
> >
> > Saw them, thanks! Will include them in v3+.
Powered by blists - more mailing lists