Message-ID: <040bc4de947fc4cca74dcad89464c5b714c5949d.camel@kernel.org>
Date: Tue, 18 May 2021 12:23:54 -0700
From: Saeed Mahameed <saeed@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>, eric.dumazet@...il.com
Cc: netdev@...r.kernel.org
Subject: Re: [PATCH net] mlx5e: add add missing BH locking around
napi_schdule()
On Wed, 2021-05-05 at 13:20 -0700, Jakub Kicinski wrote:
> It's not correct to call napi_schedule() in pure process
> context. Because we use __raise_softirq_irqoff() we require
> callers to be in a context which will eventually lead to
> softirq handling (hardirq, bh disabled, etc.).
>
> With code as is users will see:
>
> NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
>
> Fixes: a8dd7ac12fc3 ("net/mlx5e: Generalize RQ activation")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> We may want to patch net-next once it opens to switch
> from __raise_softirq_irqoff() to raise_softirq_irqoff().
> The irq_count() check is probably negligible and we'd need
> to split the hardirq / non-hardirq paths completely to
> keep the current behaviour. Plus what's hardirq is murky
> with RT enabled..
>
> Eric WDYT?
>
I was waiting for Eric to reply. Anyway, I think this patch is correct
as is.
Jakub, do you want me to submit it to net via the net-mlx5 branch?

Another valid solution is for the driver to avoid calling
napi_schedule() from process context altogether. We already have the
mlx5e_trigger_irq() mechanism, which could be used here, but that needs
some refactoring to move the icosq object from the main rx rq to the
containing channel object.

> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index bca832cdc4cb..11e50f5b3a1e 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -889,10 +889,13 @@ int mlx5e_open_rq(struct mlx5e_params *params, struct mlx5e_rq_param *param,
>  void mlx5e_activate_rq(struct mlx5e_rq *rq)
>  {
>  	set_bit(MLX5E_RQ_STATE_ENABLED, &rq->state);
> -	if (rq->icosq)
> +	if (rq->icosq) {
>  		mlx5e_trigger_irq(rq->icosq);
> -	else
> +	} else {
> +		local_bh_disable();
>  		napi_schedule(rq->cq.napi);
> +		local_bh_enable();
> +	}
>  }
>
>  void mlx5e_deactivate_rq(struct mlx5e_rq *rq)
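
For anyone following the thread, here is a rough userspace toy model of
why the BH bracket matters. It is only a sketch under assumptions:
struct cpu_model and every model_* function are hypothetical names made
up for illustration, not kernel APIs. The model assumes only that
raising a softirq with __raise_softirq_irqoff() just marks it pending,
and that the outermost local_bh_enable() runs whatever is pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU; hypothetical, not a kernel structure. */
struct cpu_model {
	int bh_disable_depth;	/* nesting depth of BH-disabled sections */
	bool softirq_pending;	/* a softirq was raised but has not run yet */
	int polls_run;		/* number of times the poll handler ran */
};

/* Run pending softirq work, if any. */
static void model_run_softirqs(struct cpu_model *cpu)
{
	if (cpu->softirq_pending) {
		cpu->softirq_pending = false;
		cpu->polls_run++;
	}
}

static void model_local_bh_disable(struct cpu_model *cpu)
{
	cpu->bh_disable_depth++;
}

/* Leaving the outermost BH-disabled section runs pending softirqs. */
static void model_local_bh_enable(struct cpu_model *cpu)
{
	assert(cpu->bh_disable_depth > 0);
	if (--cpu->bh_disable_depth == 0)
		model_run_softirqs(cpu);
}

/* Stand-in for napi_schedule(): only marks the softirq pending. */
static void model_napi_schedule(struct cpu_model *cpu)
{
	cpu->softirq_pending = true;
}

/* Pre-patch shape: schedule from plain process context.
 * Returns true if softirq work is still pending on return. */
static bool model_activate_unlocked(struct cpu_model *cpu)
{
	model_napi_schedule(cpu);
	return cpu->softirq_pending;
}

/* Patched shape: bracket the schedule with BH disable/enable.
 * Returns true if softirq work is still pending on return. */
static bool model_activate_bh_locked(struct cpu_model *cpu)
{
	model_local_bh_disable(cpu);
	model_napi_schedule(cpu);
	model_local_bh_enable(cpu);
	return cpu->softirq_pending;
}
```

In this model the unlocked variant returns with work still pending and
nothing guaranteed to run it, which is the condition the NOHZ warning
complains about. The bracketed variant has already run the poll by the
time it returns.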