lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 12 Mar 2021 12:54:18 -0800
From:   Saeed Mahameed <saeed@...nel.org>
To:     Antoine Tenart <atenart@...nel.org>, davem@...emloft.net,
        kuba@...nel.org, alexander.duyck@...il.com,
        Tariq Toukan <tariqt@...dia.com>,
        Maxim Mykytianskyi <maximmi@...dia.com>
Cc:     netdev@...r.kernel.org
Subject: Re: [PATCH net-next v3 15/16] net/mlx5e: take the rtnl lock when
 calling netif_set_xps_queue

On Fri, 2021-03-12 at 16:04 +0100, Antoine Tenart wrote:
> netif_set_xps_queue must be called with the rtnl lock taken, and this
> is
> now enforced using ASSERT_RTNL(). mlx5e_attach_netdev was taking the
> lock conditionally, fix this by taking the rtnl lock all the time.
> 
> Signed-off-by: Antoine Tenart <atenart@...nel.org>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index ec2fcb2a2977..96cba86b9f0d 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -5557,7 +5557,6 @@ static void mlx5e_update_features(struct
> net_device *netdev)
>  
>  int mlx5e_attach_netdev(struct mlx5e_priv *priv)
>  {
> -       const bool take_rtnl = priv->netdev->reg_state ==
> NETREG_REGISTERED;
>         const struct mlx5e_profile *profile = priv->profile;
>         int max_nch;
>         int err;
> @@ -5578,15 +5577,11 @@ int mlx5e_attach_netdev(struct mlx5e_priv
> *priv)
>          * 2. Set our default XPS cpumask.
>          * 3. Build the RQT.
>          *
> -        * rtnl_lock is required by netif_set_real_num_*_queues in case
> the
> -        * netdev has been registered by this point (if this function
> was called
> -        * in the reload or resume flow).
> +        * rtnl_lock is required by netif_set_xps_queue.
>          */

There is a reason why it is conditional:
we had a bug in the past of double locking here:

[ 4255.283960] echo/644 is trying to acquire lock:

 [ 4255.285092] ffffffff85101f90 (rtnl_mutex){+..}, at:
mlx5e_attach_netdev0xd4/0×3d0 [mlx5_core]

 [ 4255.287264] 

 [ 4255.287264] but task is already holding lock:

 [ 4255.288971] ffffffff85101f90 (rtnl_mutex){+..}, at:
ipoib_vlan_add0×7c/0×2d0 [ib_ipoib]

ipoib_vlan_add is called under rtnl and will eventually call 
mlx5e_attach_netdev, we don't have much control over this in mlx5
driver since the rdma stack provides a per-prepared netdev to attach to
our hw. maybe it is time we had a nested rtnl lock .. 

> -       if (take_rtnl)
> -               rtnl_lock();
> +       rtnl_lock();
>         err = mlx5e_num_channels_changed(priv);
> -       if (take_rtnl)
> -               rtnl_unlock();
> +       rtnl_unlock();
>         if (err)
>                 goto out;
>  


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ