Message-ID: <CANn89iJgT49PKvwZKXShQXivayESxRWYOHC5tHC8CLwkTSwmZg@mail.gmail.com>
Date: Mon, 15 May 2023 18:39:20 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, pabeni@...hat.com,
saeedm@...dia.com, leon@...nel.org, brouer@...hat.com, tariqt@...lanox.com
Subject: Re: [PATCH net] net/mlx5e: do as little as possible in napi poll when budget is 0
On Fri, May 12, 2023 at 4:57 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> NAPI gets called with budget of 0 from netpoll, which has interrupts
> disabled. We should try to free some space on Tx rings and nothing
> else.
>
> Specifically do not try to handle XDP TX or try to refill Rx buffers -
> we can't use the page pool from IRQ context. Don't check if IRQs moved,
> either, that makes no sense in netpoll. Netpoll calls _all_ the rings
> from whatever CPU it happens to be invoked on.
>
> In general do as little as possible, the work quickly adds up when
> there's tens of rings to poll.
>
> The immediate stack trace I was seeing is:
>
> __do_softirq+0xd1/0x2c0
> __local_bh_enable_ip+0xc7/0x120
> </IRQ>
> <TASK>
> page_pool_put_defragged_page+0x267/0x320
> mlx5e_free_xdpsq_desc+0x99/0xd0
> mlx5e_poll_xdpsq_cq+0x138/0x3b0
> mlx5e_napi_poll+0xc3/0x8b0
> netpoll_poll_dev+0xce/0x150
>
> AFAIU page pool takes a BH lock, releases it and since BH is now
> enabled tries to run softirqs.
>
> Fixes: 60bbf7eeef10 ("mlx5: use page_pool for xdp_return_frame call")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> I'm pointing Fixes at where page_pool was added, although that's
> probably not 100% fair.
>
> CC: saeedm@...dia.com
> CC: leon@...nel.org
> CC: brouer@...hat.com
> CC: tariqt@...lanox.com
> ---
> .../net/ethernet/mellanox/mlx5/core/en_txrx.c | 19 ++++++++++++-------
> 1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
> index a50bfda18e96..bd4294dd72da 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
> @@ -161,20 +161,25 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
> }
> }
>
> + /* budget=0 means we may be in IRQ context, do as little as possible */
> + if (unlikely(!budget)) {
> + /* no work done, can't be asked to re-enable IRQs */
> + WARN_ON_ONCE(napi_complete_done(napi, work_done));
It is not clear why you call napi_complete_done() here.
Note that the fine doc (https://www.kernel.org/doc/html/next/networking/napi.html)
says:
<quote>If the budget is 0 napi_complete_done() should never be called.</quote>
> + goto out;
> + }
> +
> busy |= mlx5e_poll_xdpsq_cq(&c->xdpsq.cq);
>
> if (c->xdp)
> busy |= mlx5e_poll_xdpsq_cq(&c->rq_xdpsq.cq);
>
> - if (likely(budget)) { /* budget=0 means: don't poll rx rings */
> - if (xsk_open)
> - work_done = mlx5e_poll_rx_cq(&xskrq->cq, budget);
> + if (xsk_open)
> + work_done = mlx5e_poll_rx_cq(&xskrq->cq, budget);
>
> - if (likely(budget - work_done))
> - work_done += mlx5e_poll_rx_cq(&rq->cq, budget - work_done);
> + if (likely(budget - work_done))
> + work_done += mlx5e_poll_rx_cq(&rq->cq, budget - work_done);
>
> - busy |= work_done == budget;
> - }
> + busy |= work_done == budget;
>
> mlx5e_poll_ico_cq(&c->icosq.cq);
> if (mlx5e_poll_ico_cq(&c->async_icosq.cq))
> --
> 2.40.1
>
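To make the budget == 0 rule quoted above concrete, here is a minimal user-space sketch of a poll routine that frees Tx completions and then returns early, without touching Rx and without calling the completion helper. It is not the mlx5e code and not kernel code: struct mock_channel and every mock_* function are hypothetical stand-ins invented for illustration, and the real napi_complete_done() does far more than count calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver channel; not the mlx5e structures. */
struct mock_channel {
	int tx_pending;      /* completed Tx descriptors waiting to be freed */
	int rx_pending;      /* received packets waiting to be processed */
	int complete_calls;  /* how many times the mock completion helper ran */
	bool irq_armed;      /* set when the poll routine re-arms the IRQ */
};

/* Mock of napi_complete_done(): only counts calls and reports success. */
static bool mock_napi_complete_done(struct mock_channel *c, int work_done)
{
	(void)work_done;
	c->complete_calls++;
	return true;
}

/* Free all completed Tx descriptors; returns true if more work remains. */
static bool mock_poll_tx(struct mock_channel *c)
{
	c->tx_pending = 0;
	return false;
}

/* Process up to 'budget' Rx packets; returns the number processed. */
static int mock_poll_rx(struct mock_channel *c, int budget)
{
	int done = c->rx_pending < budget ? c->rx_pending : budget;

	c->rx_pending -= done;
	return done;
}

static int mock_napi_poll(struct mock_channel *c, int budget)
{
	int work_done;
	bool busy;

	/* Tx cleanup is the only work allowed regardless of budget. */
	busy = mock_poll_tx(c);

	/* budget == 0: no Rx, no completion call, no IRQ re-arm. */
	if (!budget)
		return 0;

	work_done = mock_poll_rx(c, budget);
	assert(work_done <= budget);
	busy |= work_done == budget;

	/* Still busy: report the full budget so the caller polls again. */
	if (busy)
		return budget;

	/* Only a poll with a non-zero budget may complete and re-arm. */
	if (mock_napi_complete_done(c, work_done))
		c->irq_armed = true;

	return work_done;
}
```

Calling mock_napi_poll() with a budget of 0 on a channel that has both Tx and Rx work pending should drain only the Tx side and leave complete_calls at zero; a later call with a normal budget processes Rx and completes once.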