Message-ID: <ZT1nSGYng8sUKQD7@boxer>
Date: Sat, 28 Oct 2023 21:55:52 +0200
From: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
To: Larysa Zaremba <larysa.zaremba@...el.com>
CC: <bpf@...r.kernel.org>, <ast@...nel.org>, <daniel@...earbox.net>,
<andrii@...nel.org>, <martin.lau@...ux.dev>, <song@...nel.org>, <yhs@...com>,
<john.fastabend@...il.com>, <kpsingh@...nel.org>, <sdf@...gle.com>,
<haoluo@...gle.com>, <jolsa@...nel.org>, David Ahern <dsahern@...il.com>,
Jakub Kicinski <kuba@...nel.org>, Willem de Bruijn <willemb@...gle.com>,
Jesper Dangaard Brouer <hawk@...nel.org>, Anatoly Burakov
<anatoly.burakov@...el.com>, Alexander Lobakin <alexandr.lobakin@...el.com>,
Magnus Karlsson <magnus.karlsson@...il.com>, Maryam Tahhan
<mtahhan@...hat.com>, <xdp-hints@...-project.net>, <netdev@...r.kernel.org>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>, Alexei Starovoitov
<alexei.starovoitov@...il.com>, Tariq Toukan <tariqt@...lanox.com>, "Saeed
Mahameed" <saeedm@...lanox.com>, <toke@...nel.org>
Subject: Re: [PATCH bpf-next v6 11/18] ice: put XDP meta sources assignment
under a static key condition
On Mon, Oct 23, 2023 at 11:35:46AM +0200, Larysa Zaremba wrote:
> On Fri, Oct 20, 2023 at 06:32:13PM +0200, Maciej Fijalkowski wrote:
> > On Thu, Oct 12, 2023 at 07:05:17PM +0200, Larysa Zaremba wrote:
> > > Usage of XDP hints requires putting additional information after the
> > > xdp_buff. In basic case, only the descriptor has to be copied on a
> > > per-packet basis, because xdp_buff permanently resides before per-ring
> > > metadata (cached time and VLAN protocol ID).
> > >
> > > However, in ZC mode, xdp_buffs come from a pool, so memory after such
> > > buffer does not contain any reliable information, so everything has to be
> > > copied, damaging the performance.
> > >
> > > Introduce a static key to enable meta sources assignment only when attached
> > > XDP program is device-bound.
> > >
> > > This patch eliminates a 6% performance drop in ZC mode, which was a result
> > > of addition of XDP hints to the driver.
> > >
> > > Signed-off-by: Larysa Zaremba <larysa.zaremba@...el.com>
> > > ---
> > > drivers/net/ethernet/intel/ice/ice.h | 1 +
> > > drivers/net/ethernet/intel/ice/ice_main.c | 14 ++++++++++++++
> > > drivers/net/ethernet/intel/ice/ice_txrx.c | 3 ++-
> > > drivers/net/ethernet/intel/ice/ice_xsk.c | 3 +++
> > > 4 files changed, 20 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> > > index 3d0f15f8b2b8..76d22be878a4 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice.h
> > > +++ b/drivers/net/ethernet/intel/ice/ice.h
> > > @@ -210,6 +210,7 @@ enum ice_feature {
> > > };
> > >
> > > DECLARE_STATIC_KEY_FALSE(ice_xdp_locking_key);
> > > +DECLARE_STATIC_KEY_FALSE(ice_xdp_meta_key);
> > >
> > > struct ice_channel {
> > > struct list_head list;
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> > > index 47e8920e1727..ee0df86d34b7 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > > @@ -48,6 +48,9 @@ MODULE_PARM_DESC(debug, "netif level (0=none,...,16=all)");
> > > DEFINE_STATIC_KEY_FALSE(ice_xdp_locking_key);
> > > EXPORT_SYMBOL(ice_xdp_locking_key);
> > >
> > > +DEFINE_STATIC_KEY_FALSE(ice_xdp_meta_key);
> > > +EXPORT_SYMBOL(ice_xdp_meta_key);
> > > +
> > > /**
> > > * ice_hw_to_dev - Get device pointer from the hardware structure
> > > * @hw: pointer to the device HW structure
> > > @@ -2634,6 +2637,11 @@ static int ice_xdp_alloc_setup_rings(struct ice_vsi *vsi)
> > > return -ENOMEM;
> > > }
> > >
> > > +static bool ice_xdp_prog_has_meta(struct bpf_prog *prog)
> > > +{
> > > + return prog && prog->aux->dev_bound;
> > > +}
> > > +
> > > /**
> > > * ice_vsi_assign_bpf_prog - set or clear bpf prog pointer on VSI
> > > * @vsi: VSI to set the bpf prog on
> > > @@ -2644,10 +2652,16 @@ static void ice_vsi_assign_bpf_prog(struct ice_vsi *vsi, struct bpf_prog *prog)
> > > struct bpf_prog *old_prog;
> > > int i;
> > >
> > > + if (ice_xdp_prog_has_meta(prog))
> > > + static_branch_inc(&ice_xdp_meta_key);
> >
> > i thought boolean key would be enough but inc/dec should serve properly
> > for example prog hotswap cases.
> >
>
> My thought process on using counting instead of a boolean was: there can be
> several PFs that use the same driver, so we need to keep track of how
> many of them use hints.
Very good point. This implies that if PF0 has a hints-enabled prog loaded,
PF1 with a non-hints prog will "suffer" from it.
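Just to spell out the refcount semantics (a sketch, not code from the
patch; the PF numbering is only for illustration):

	/* PF0 loads a hints-enabled prog */
	static_branch_inc(&ice_xdp_meta_key);	/* 0 -> 1, branch enabled */
	/* PF1 loads a hints-enabled prog too */
	static_branch_inc(&ice_xdp_meta_key);	/* 1 -> 2 */
	/* PF0 removes its prog */
	static_branch_dec(&ice_xdp_meta_key);	/* 2 -> 1, branch stays enabled */
	/* PF1 removes its prog */
	static_branch_dec(&ice_xdp_meta_key);	/* 1 -> 0, branch disabled */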
Sorry for such long delays in responses, but I was having a hard time
making up my mind about it. In the end I have come to some conclusions.
I know the timing for sending this response is not ideal, but I need to
get this off my chest and bring the discussion back to life :)
IMHO having static keys to eliminate the ZC overhead does not scale. I
assume every other driver would have to follow suit.
The XSK pool allows us to avoid initializing various things per packet.
Instead, taking xdp_rxq_info as an example, each xdp_buff from the pool
has xdp_rxq_info assigned at init time. With this in mind, we should have
some mechanism to set hints-specific things in xdp_buff_xsk::cb at init
time as well. Such a mechanism should not require us to expose the
driver's private xdp_buff hints containers (such as ice_pkt_ctx) to the
XSK pool.
Right now you moved phctime down to ice_pkt_ctx, and to me that's the
main reason we have to copy ice_pkt_ctx to each xdp_buff on ZC. What if
we keep cached_phctime at its original offset in the ring and have
ice_pkt_ctx carry a pointer to it instead?
This would allow us to init the pointer in each xdp_buff from the XSK
pool at init time. I have come up with a way to program that via
so-called XSK meta descriptors. Each desc would carry the data to write
into cb, the offset within cb and the number of bytes to write/copy.
I'll share the diff below, but note that I didn't measure how much this
reduces the performance degradation. My icelake machine where I used to
measure performance-sensitive code broke down. For now we can't escape
initing eop_desc per xdp_buff, but I moved it to the alloc side, as we
mangle descs there anyway.
I think mlx5 could benefit from that approach as well by initing the rq
ptr at init time.
The diff mostly does these things:
- move cached_phctime back to its old place in ice_rx_ring and add a ptr
  to it in ice_pkt_ctx
- introduce xsk_pool_set_meta()
- use it from the ice side.
I consider this a discussion trigger rather than ready code. Any
feedback will be appreciated.
---------------------------------8<---------------------------------
diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
index 7fa43827a3f0..c192e84bee55 100644
--- a/drivers/net/ethernet/intel/ice/ice_base.c
+++ b/drivers/net/ethernet/intel/ice/ice_base.c
@@ -519,6 +519,23 @@ static int ice_setup_rx_ctx(struct ice_rx_ring *ring)
return 0;
}
+static void ice_xsk_pool_set_meta(struct ice_rx_ring *ring)
+{
+ struct xsk_meta_desc desc = {};
+
+ desc.val = (uintptr_t)&ring->cached_phctime;
+ desc.off = offsetof(struct ice_pkt_ctx, cached_phctime);
+ desc.bytes = sizeof_field(struct ice_pkt_ctx, cached_phctime);
+ xsk_pool_set_meta(ring->xsk_pool, &desc);
+
+ memset(&desc, 0, sizeof(struct xsk_meta_desc));
+
+ desc.val = ring->pkt_ctx.vlan_proto;
+ desc.off = offsetof(struct ice_pkt_ctx, vlan_proto);
+ desc.bytes = sizeof_field(struct ice_pkt_ctx, vlan_proto);
+ xsk_pool_set_meta(ring->xsk_pool, &desc);
+}
+
/**
* ice_vsi_cfg_rxq - Configure an Rx queue
* @ring: the ring being configured
@@ -553,6 +570,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
if (err)
return err;
xsk_pool_set_rxq_info(ring->xsk_pool, &ring->xdp_rxq);
+ ice_xsk_pool_set_meta(ring);
dev_info(dev, "Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring %d\n",
ring->q_index);
@@ -575,6 +593,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
xdp_init_buff(&ring->xdp, ice_rx_pg_size(ring) / 2, &ring->xdp_rxq);
ring->xdp.data = NULL;
+ ring->pkt_ctx.cached_phctime = &ring->cached_phctime;
err = ice_setup_rx_ctx(ring);
if (err) {
dev_err(dev, "ice_setup_rx_ctx failed for RxQ %d, err %d\n",
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index cf5c91ada94c..d3cb08e66dcb 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -2846,7 +2846,7 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
/* clone ring and setup updated count */
rx_rings[i] = *vsi->rx_rings[i];
rx_rings[i].count = new_rx_cnt;
- rx_rings[i].pkt_ctx.cached_phctime = pf->ptp.cached_phc_time;
+ rx_rings[i].cached_phctime = pf->ptp.cached_phc_time;
rx_rings[i].desc = NULL;
rx_rings[i].rx_buf = NULL;
/* this is to allow wr32 to have something to write to
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 2fc97eafd1f6..1f45f0c3963d 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -1456,7 +1456,7 @@ static int ice_vsi_alloc_rings(struct ice_vsi *vsi)
ring->netdev = vsi->netdev;
ring->dev = dev;
ring->count = vsi->num_rx_desc;
- ring->pkt_ctx.cached_phctime = pf->ptp.cached_phc_time;
+ ring->cached_phctime = pf->ptp.cached_phc_time;
WRITE_ONCE(vsi->rx_rings[i], ring);
}
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index f6444890f0ef..e2fa979830cd 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -955,8 +955,7 @@ static int ice_ptp_update_cached_phctime(struct ice_pf *pf)
ice_for_each_rxq(vsi, j) {
if (!vsi->rx_rings[j])
continue;
- WRITE_ONCE(vsi->rx_rings[j]->pkt_ctx.cached_phctime,
- systime);
+ WRITE_ONCE(vsi->rx_rings[j]->cached_phctime, systime);
}
}
clear_bit(ICE_CFG_BUSY, pf->state);
@@ -2119,7 +2118,7 @@ u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
if (!(rx_desc->wb.time_stamp_low & ICE_PTP_TS_VALID))
return 0;
- cached_time = READ_ONCE(pkt_ctx->cached_phctime);
+ cached_time = READ_ONCE(*pkt_ctx->cached_phctime);
/* Do not report a timestamp if we don't have a cached PHC time */
if (!cached_time)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index 41e0b14e6643..94594cc0d3ee 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -259,7 +259,7 @@ enum ice_rx_dtype {
struct ice_pkt_ctx {
const union ice_32b_rx_flex_desc *eop_desc;
- u64 cached_phctime;
+ u64 *cached_phctime;
__be16 vlan_proto;
};
@@ -356,6 +356,7 @@ struct ice_rx_ring {
struct ice_tx_ring *xdp_ring;
struct xsk_buff_pool *xsk_pool;
dma_addr_t dma; /* physical address of ring */
+ u64 cached_phctime;
u16 rx_buf_len;
u8 dcb_tc; /* Traffic class of ring */
u8 ptp_rx;
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 49a64bfdd1f6..6fa7a86152d0 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -431,9 +431,18 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool, u16 qid)
return ret;
}
+static struct ice_xdp_buff *xsk_buff_to_ice_ctx(struct xdp_buff *xdp)
+{
+ /* xdp_buff pointer used by ZC code path is allocated as xdp_buff_xsk. The
+ * ice_xdp_buff shares its layout with xdp_buff_xsk and private
+ * ice_xdp_buff fields fall into xdp_buff_xsk->cb
+ */
+ return (struct ice_xdp_buff *)xdp;
+}
+
/**
* ice_fill_rx_descs - pick buffers from XSK buffer pool and use it
- * @pool: XSK Buffer pool to pull the buffers from
+ * @rx_ring: rx ring
* @xdp: SW ring of xdp_buff that will hold the buffers
* @rx_desc: Pointer to Rx descriptors that will be filled
* @count: The number of buffers to allocate
@@ -445,18 +454,21 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool, u16 qid)
*
* Returns the amount of allocated Rx descriptors
*/
-static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
+static u16 ice_fill_rx_descs(struct ice_rx_ring *rx_ring, struct xdp_buff **xdp,
union ice_32b_rx_flex_desc *rx_desc, u16 count)
{
+ struct ice_xdp_buff *ctx;
dma_addr_t dma;
u16 buffs;
int i;
- buffs = xsk_buff_alloc_batch(pool, xdp, count);
+ buffs = xsk_buff_alloc_batch(rx_ring->xsk_pool, xdp, count);
for (i = 0; i < buffs; i++) {
dma = xsk_buff_xdp_get_dma(*xdp);
rx_desc->read.pkt_addr = cpu_to_le64(dma);
rx_desc->wb.status_error0 = 0;
+ ctx = xsk_buff_to_ice_ctx(*xdp);
+ ctx->pkt_ctx.eop_desc = rx_desc;
rx_desc++;
xdp++;
@@ -488,8 +500,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
xdp = ice_xdp_buf(rx_ring, ntu);
if (ntu + count >= rx_ring->count) {
- nb_buffs_extra = ice_fill_rx_descs(rx_ring->xsk_pool, xdp,
- rx_desc,
+ nb_buffs_extra = ice_fill_rx_descs(rx_ring, xdp, rx_desc,
rx_ring->count - ntu);
if (nb_buffs_extra != rx_ring->count - ntu) {
ntu += nb_buffs_extra;
@@ -502,7 +513,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
ice_release_rx_desc(rx_ring, 0);
}
- nb_buffs = ice_fill_rx_descs(rx_ring->xsk_pool, xdp, rx_desc, count);
+ nb_buffs = ice_fill_rx_descs(rx_ring, xdp, rx_desc, count);
ntu += nb_buffs;
if (ntu == rx_ring->count)
@@ -746,32 +757,6 @@ static int ice_xmit_xdp_tx_zc(struct xdp_buff *xdp,
return ICE_XDP_CONSUMED;
}
-/**
- * ice_prepare_pkt_ctx_zc - Prepare packet context for XDP hints
- * @xdp: xdp_buff used as input to the XDP program
- * @eop_desc: End of packet descriptor
- * @rx_ring: Rx ring with packet context
- *
- * In regular XDP, xdp_buff is placed inside the ring structure,
- * just before the packet context, so the latter can be accessed
- * with xdp_buff address only at all times, but in ZC mode,
- * xdp_buffs come from the pool, so we need to reinitialize
- * context for every packet.
- *
- * We can safely convert xdp_buff_xsk to ice_xdp_buff,
- * because there are XSK_PRIV_MAX bytes reserved in xdp_buff_xsk
- * right after xdp_buff, for our private use.
- * XSK_CHECK_PRIV_TYPE() ensures we do not go above the limit.
- */
-static void ice_prepare_pkt_ctx_zc(struct xdp_buff *xdp,
- union ice_32b_rx_flex_desc *eop_desc,
- struct ice_rx_ring *rx_ring)
-{
- XSK_CHECK_PRIV_TYPE(struct ice_xdp_buff);
- ((struct ice_xdp_buff *)xdp)->pkt_ctx = rx_ring->pkt_ctx;
- ice_xdp_meta_set_desc(xdp, eop_desc);
-}
-
/**
* ice_run_xdp_zc - Executes an XDP program in zero-copy path
* @rx_ring: Rx ring
@@ -784,13 +769,11 @@ static void ice_prepare_pkt_ctx_zc(struct xdp_buff *xdp,
*/
static int
ice_run_xdp_zc(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
- struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
- union ice_32b_rx_flex_desc *rx_desc)
+ struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring)
{
int err, result = ICE_XDP_PASS;
u32 act;
- ice_prepare_pkt_ctx_zc(xdp, rx_desc, rx_ring);
act = bpf_prog_run_xdp(xdp_prog, xdp);
if (likely(act == XDP_REDIRECT)) {
@@ -930,8 +913,7 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
if (ice_is_non_eop(rx_ring, rx_desc))
continue;
- xdp_res = ice_run_xdp_zc(rx_ring, first, xdp_prog, xdp_ring,
- rx_desc);
+ xdp_res = ice_run_xdp_zc(rx_ring, first, xdp_prog, xdp_ring);
if (likely(xdp_res & (ICE_XDP_TX | ICE_XDP_REDIR))) {
xdp_xmit |= xdp_res;
} else if (xdp_res == ICE_XDP_EXIT) {
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 1f6fc8c7a84c..91fa74a14841 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -14,6 +14,13 @@
#ifdef CONFIG_XDP_SOCKETS
+struct xsk_meta_desc {
+ u64 val;
+ u8 off;
+ u8 bytes;
+};
+
+
void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries);
bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc);
u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max);
@@ -47,6 +54,12 @@ static inline void xsk_pool_set_rxq_info(struct xsk_buff_pool *pool,
xp_set_rxq_info(pool, rxq);
}
+static inline void xsk_pool_set_meta(struct xsk_buff_pool *pool,
+ struct xsk_meta_desc *desc)
+{
+ xp_set_meta(pool, desc);
+}
+
static inline unsigned int xsk_pool_get_napi_id(struct xsk_buff_pool *pool)
{
#ifdef CONFIG_NET_RX_BUSY_POLL
@@ -250,6 +263,11 @@ static inline void xsk_pool_set_rxq_info(struct xsk_buff_pool *pool,
{
}
+static inline void xsk_pool_set_meta(struct xsk_buff_pool *pool,
+ struct xsk_meta_desc *desc)
+{
+}
+
static inline unsigned int xsk_pool_get_napi_id(struct xsk_buff_pool *pool)
{
return 0;
diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index b0bdff26fc88..354b1c702a82 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -12,6 +12,7 @@
struct xsk_buff_pool;
struct xdp_rxq_info;
+struct xsk_meta_desc;
struct xsk_queue;
struct xdp_desc;
struct xdp_umem;
@@ -132,6 +133,7 @@ static inline void xp_init_xskb_dma(struct xdp_buff_xsk *xskb, struct xsk_buff_p
/* AF_XDP ZC drivers, via xdp_sock_buff.h */
void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq);
+void xp_set_meta(struct xsk_buff_pool *pool, struct xsk_meta_desc *desc);
int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev,
unsigned long attrs, struct page **pages, u32 nr_pages);
void xp_dma_unmap(struct xsk_buff_pool *pool, unsigned long attrs);
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 49cb9f9a09be..632fdc247862 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -123,6 +123,18 @@ void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq)
}
EXPORT_SYMBOL(xp_set_rxq_info);
+void xp_set_meta(struct xsk_buff_pool *pool, struct xsk_meta_desc *desc)
+{
+ u32 i;
+
+ for (i = 0; i < pool->heads_cnt; i++) {
+ struct xdp_buff_xsk *xskb = &pool->heads[i];
+
+ memcpy(xskb->cb + desc->off, &desc->val, desc->bytes);
+ }
+}
+EXPORT_SYMBOL(xp_set_meta);
+
static void xp_disable_drv_zc(struct xsk_buff_pool *pool)
{
struct netdev_bpf bpf;
--------------------------------->8---------------------------------
> And yes, this also looks better for hot-swapping,
> because conditions become more straightforward (we do not need to compare old
> and new programs).
>
> > > +
> > > old_prog = xchg(&vsi->xdp_prog, prog);
> > > ice_for_each_rxq(vsi, i)
> > > WRITE_ONCE(vsi->rx_rings[i]->xdp_prog, vsi->xdp_prog);
> > >
> > > + if (ice_xdp_prog_has_meta(old_prog))
> > > + static_branch_dec(&ice_xdp_meta_key);
> > > +
> > > if (old_prog)
> > > bpf_prog_put(old_prog);
> > > }
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > index 4fd7614f243d..19fc182d1f4c 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > @@ -572,7 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > if (!xdp_prog)
> > > goto exit;
> > >
> > > - ice_xdp_meta_set_desc(xdp, eop_desc);
> > > + if (static_branch_unlikely(&ice_xdp_meta_key))
> >
> > My only concern is that we might be hurting the hints path in a minor way
> > now, no?
>
> I thought "unlikely" refers to the default state the code is compiled with,
> and after the static key is incremented this would be patched to "likely".
> Isn't this how static keys work?
I was only referring to the fact that it ends up as a compiler hint:
#define unlikely_notrace(x) __builtin_expect(!!(x), 0)
see include/linux/jump_label.h
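To illustrate (a minimal sketch, not code from the patch; the key and
function names are made up): with jump labels, a key defined false makes
static_branch_unlikely() compile to a nop that gets patched to a jump
once the key becomes non-zero, so unlikely() mainly steers code placement
(the hints path goes out of line), not a per-packet compare:

	DEFINE_STATIC_KEY_FALSE(example_meta_key);

	static void example_rx(struct xdp_buff *xdp,
			       union ice_32b_rx_flex_desc *eop_desc)
	{
		/* nop while the key is 0; patched to a jump after
		 * static_branch_inc(&example_meta_key)
		 */
		if (static_branch_unlikely(&example_meta_key))
			ice_xdp_meta_set_desc(xdp, eop_desc);
		/* fast path continues here either way */
	}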
>
> >
> > > + ice_xdp_meta_set_desc(xdp, eop_desc);
> > >
> > > act = bpf_prog_run_xdp(xdp_prog, xdp);
> > > switch (act) {
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > > index 39775bb6cec1..f92d7d33fde6 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > > @@ -773,6 +773,9 @@ static void ice_prepare_pkt_ctx_zc(struct xdp_buff *xdp,
> > > union ice_32b_rx_flex_desc *eop_desc,
> > > struct ice_rx_ring *rx_ring)
> > > {
> > > + if (!static_branch_unlikely(&ice_xdp_meta_key))
> > > + return;
> >
> > wouldn't it be better to pull it out and avoid calling
> > ice_prepare_pkt_ctx_zc() unnecessarily?
> >
> > > +
> > > XSK_CHECK_PRIV_TYPE(struct ice_xdp_buff);
> > > ((struct ice_xdp_buff *)xdp)->pkt_ctx = rx_ring->pkt_ctx;
> > > ice_xdp_meta_set_desc(xdp, eop_desc);
> > > --
> > > 2.41.0
> > >