Date: Mon, 15 May 2023 18:39:20 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, pabeni@...hat.com, 
	saeedm@...dia.com, leon@...nel.org, brouer@...hat.com, tariqt@...lanox.com
Subject: Re: [PATCH net] net/mlx5e: do as little as possible in napi poll when
 budget is 0

On Fri, May 12, 2023 at 4:57 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> NAPI gets called with budget of 0 from netpoll, which has interrupts
> disabled. We should try to free some space on Tx rings and nothing
> else.
>
> Specifically do not try to handle XDP TX or try to refill Rx buffers -
> we can't use the page pool from IRQ context. Don't check if IRQs moved,
> either, that makes no sense in netpoll. Netpoll calls _all_ the rings
> from whatever CPU it happens to be invoked on.
>
> In general do as little as possible, the work quickly adds up when
> there's tens of rings to poll.
>
> The immediate stack trace I was seeing is:
>
>     __do_softirq+0xd1/0x2c0
>     __local_bh_enable_ip+0xc7/0x120
>     </IRQ>
>     <TASK>
>     page_pool_put_defragged_page+0x267/0x320
>     mlx5e_free_xdpsq_desc+0x99/0xd0
>     mlx5e_poll_xdpsq_cq+0x138/0x3b0
>     mlx5e_napi_poll+0xc3/0x8b0
>     netpoll_poll_dev+0xce/0x150
>
> AFAIU page pool takes a BH lock, releases it and since BH is now
> enabled tries to run softirqs.
>
> Fixes: 60bbf7eeef10 ("mlx5: use page_pool for xdp_return_frame call")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> I'm pointing Fixes at where page_pool was added, although that's
> probably not 100% fair.
>
> CC: saeedm@...dia.com
> CC: leon@...nel.org
> CC: brouer@...hat.com
> CC: tariqt@...lanox.com
> ---
>  .../net/ethernet/mellanox/mlx5/core/en_txrx.c | 19 ++++++++++++-------
>  1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
> index a50bfda18e96..bd4294dd72da 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
> @@ -161,20 +161,25 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
>                 }
>         }
>
> +       /* budget=0 means we may be in IRQ context, do as little as possible */
> +       if (unlikely(!budget)) {
> +               /* no work done, can't be asked to re-enable IRQs */
> +               WARN_ON_ONCE(napi_complete_done(napi, work_done));

It is not clear why you call napi_complete_done() here.

Note that the fine doc (https://www.kernel.org/doc/html/next/networking/napi.html)
says:

<quote>If the budget is 0 napi_complete_done() should never be called.</quote>
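
To make the expected shape concrete, here is a minimal userspace sketch of a
poll routine that handles budget 0 the way the doc describes: reclaim Tx
completions, skip Rx, and return without calling the completion helper. This
is not the mlx5e code. Every name in it (struct mock_channel,
mock_complete_done(), mock_clean_tx(), mock_poll_rx(), mock_napi_poll()) is
hypothetical, and mock_complete_done() only stands in for the role that
napi_complete_done() plays in a real driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-channel state; NOT the mlx5e structures. */
struct mock_channel {
	int tx_pending;     /* completed Tx descriptors waiting to be reclaimed */
	int rx_pending;     /* received packets waiting to be processed */
	int complete_calls; /* how many times the mock completion helper ran */
};

/* Stand-in for the role napi_complete_done() plays in a real driver. */
static bool mock_complete_done(struct mock_channel *c, int work_done)
{
	(void)work_done;
	c->complete_calls++;
	return true;
}

/* Reclaim finished Tx descriptors; this does not consume budget. */
static void mock_clean_tx(struct mock_channel *c)
{
	c->tx_pending = 0;
}

/* Process at most 'budget' Rx packets and return how many were handled. */
static int mock_poll_rx(struct mock_channel *c, int budget)
{
	int n = c->rx_pending < budget ? c->rx_pending : budget;

	c->rx_pending -= n;
	return n;
}

/* Sketch of a poll routine, assuming the budget rules quoted above. */
static int mock_napi_poll(struct mock_channel *c, int budget)
{
	int work_done;

	assert(budget >= 0);

	/* Tx cleanup is done regardless of budget. */
	mock_clean_tx(c);

	/* budget == 0: no Rx work and no completion call, just return. */
	if (budget == 0)
		return 0;

	work_done = mock_poll_rx(c, budget);

	/* Only signal completion when the budget was not exhausted. */
	if (work_done < budget)
		mock_complete_done(c, work_done);

	return work_done;
}
```

Calling mock_napi_poll() with budget 0 on a channel that has both Tx and Rx
work pending clears tx_pending, leaves rx_pending alone, and leaves
complete_calls at 0. A later call with a normal budget drains Rx and calls the
completion helper once.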



> +               goto out;
> +       }
> +
>         busy |= mlx5e_poll_xdpsq_cq(&c->xdpsq.cq);
>
>         if (c->xdp)
>                 busy |= mlx5e_poll_xdpsq_cq(&c->rq_xdpsq.cq);
>
> -       if (likely(budget)) { /* budget=0 means: don't poll rx rings */
> -               if (xsk_open)
> -                       work_done = mlx5e_poll_rx_cq(&xskrq->cq, budget);
> +       if (xsk_open)
> +               work_done = mlx5e_poll_rx_cq(&xskrq->cq, budget);
>
> -               if (likely(budget - work_done))
> -                       work_done += mlx5e_poll_rx_cq(&rq->cq, budget - work_done);
> +       if (likely(budget - work_done))
> +               work_done += mlx5e_poll_rx_cq(&rq->cq, budget - work_done);
>
> -               busy |= work_done == budget;
> -       }
> +       busy |= work_done == budget;
>
>         mlx5e_poll_ico_cq(&c->icosq.cq);
>         if (mlx5e_poll_ico_cq(&c->async_icosq.cq))
> --
> 2.40.1
>
