[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210423135704.GC64904@ranger.igk.intel.com>
Date: Fri, 23 Apr 2021 15:57:04 +0200
From: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
Martin KaFai Lau <kafai@...com>,
Hangbin Liu <liuhangbin@...il.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Magnus Karlsson <magnus.karlsson@...il.com>,
"Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: [PATCH RFC bpf-next 4/4] i40e: remove rcu_read_lock() around XDP
program invocation
On Fri, Apr 23, 2021 at 01:05:20PM +0200, Toke Høiland-Jørgensen wrote:
> From: Toke Høiland-Jørgensen <toke@...hat.com>
>
> The i40e driver has rcu_read_lock()/rcu_read_unlock() pairs around XDP
> program invocations. However, the actual lifetime of the objects referred
> by the XDP program invocation is longer, all the way through to the call to
> xdp_do_flush(), making the scope of the rcu_read_lock() too small. This
> turns out to be harmless because it all happens in a single NAPI poll
> cycle (and thus under local_bh_disable()), but it makes the rcu_read_lock()
> misleading.
Okay, but what about the lifetime of the xdp_prog itself? Can xdp_prog
change within a single NAPI poll? After reading previous discussions I
would say it can't, right?
There are drivers that have a big RCU critical section in NAPI poll, but it
seems that some read a xdp_prog a single time whereas others read it per
processed frame.
If we are sure that xdp_prog can't change on-the-fly then first low
hanging fruit, at least for the Intel drivers, is to skip a test against
NULL and read it only once at the beginning of NAPI poll. There might be
also other micro-optimizations specific to each drivers that could be done
based on that (that of course read the xdp_prog per each frame).
Or am I nuts?
>
> Rather than extend the scope of the rcu_read_lock(), just get rid of it
> entirely. With the addition of RCU annotations to the XDP_REDIRECT map
> types that take bh execution into account, lockdep even understands this to
> be safe, so there's really no reason to keep it around.
>
> Signed-off-by: Toke Høiland-Jørgensen <toke@...hat.com>
> ---
> drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 --
> drivers/net/ethernet/intel/i40e/i40e_xsk.c | 6 +-----
> 2 files changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index fc20afc23bfa..3f4c947a5185 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -2303,7 +2303,6 @@ static struct sk_buff *i40e_run_xdp(struct i40e_ring *rx_ring,
> struct bpf_prog *xdp_prog;
> u32 act;
>
> - rcu_read_lock();
> xdp_prog = READ_ONCE(rx_ring->xdp_prog);
>
> if (!xdp_prog)
> @@ -2334,7 +2333,6 @@ static struct sk_buff *i40e_run_xdp(struct i40e_ring *rx_ring,
> break;
> }
> xdp_out:
> - rcu_read_unlock();
> return ERR_PTR(-result);
> }
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index d89c22347d9d..93b349f11d3b 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -153,7 +153,6 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
> struct bpf_prog *xdp_prog;
> u32 act;
>
> - rcu_read_lock();
> /* NB! xdp_prog will always be !NULL, due to the fact that
> * this path is enabled by setting an XDP program.
> */
> @@ -162,9 +161,7 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
>
> if (likely(act == XDP_REDIRECT)) {
> err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
> - result = !err ? I40E_XDP_REDIR : I40E_XDP_CONSUMED;
> - rcu_read_unlock();
> - return result;
> + return !err ? I40E_XDP_REDIR : I40E_XDP_CONSUMED;
> }
>
> switch (act) {
> @@ -184,7 +181,6 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
> result = I40E_XDP_CONSUMED;
> break;
> }
> - rcu_read_unlock();
> return result;
> }
>
>
Powered by blists - more mailing lists