netdev - Re: [PATCH net] team: postpone features update to avoid deadlock

Message-ID: <20210122100343.792005b9@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Fri, 22 Jan 2021 10:03:43 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     Ivan Vecera <ivecera@...hat.com>
Cc:     Cong Wang <xiyou.wangcong@...il.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        saeed@...nel.org, Jiri Pirko <jiri@...nulli.us>
Subject: Re: [PATCH net] team: postpone features update to avoid deadlock

On Fri, 22 Jan 2021 09:30:27 +0100 Ivan Vecera wrote:
> On Thu, 21 Jan 2021 18:34:52 -0800
> Jakub Kicinski <kuba@...nel.org> wrote:
> 
> > On Thu, 21 Jan 2021 11:29:37 +0100 Ivan Vecera wrote:  
> > > On Wed, 20 Jan 2021 15:18:20 -0800
> > > Cong Wang <xiyou.wangcong@...il.com> wrote:    
> > > > On Wed, Jan 20, 2021 at 4:56 AM Ivan Vecera <ivecera@...hat.com> wrote:      
> > > > > Team driver protects port list traversal by its team->lock mutex
> > > > > in functions like team_change_mtu(), team_set_rx_mode(),    
> > 
> > The set_rx_mode part can't be true, set_rx_mode can't sleep and
> > team->lock is a mutex.
> >   
> > > > > To fix the problem __team_compute_features() needs to be postponed
> > > > > for these cases.        
> > > > 
> > > > Is there any user-visible effect after deferring this feature change?    
> > >
> > > A user should not notice this change.    
> > 
> > I think Cong is right, can you expand a little on your assertion?
> > User should be able to assume that the moment syscall returns the
> > features had settled.
> > 
> > What does team->lock actually protect in team_compute_features()?
> > All callers seem to hold RTNL at a quick glance. This is a bit of 
> > a long shot but isn't it just trying to protect the iteration over 
> > ports which could be under RCU?  
> 
> In fact the mutex could be removed entirely because all port-list
> writers run under rtnl_lock, as do some readers like team_change_mtu()
> or team_device_event() [notifier], and hot path readers are
> protected by RCU.
> I have discussed this with Jiri but he doesn't want to introduce any dependency
> on RTNL to team as it was designed as RTNL-independent from the beginning.
> 
> Anyway your idea to run team_compute_features under RCU could be fine
> as subsequent __team_compute_features() cannot sleep...
> 
> Do you mean something like this?
> 
> diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
> index c19dac21c468..dd7917cab2b1 100644
> --- a/drivers/net/team/team.c
> +++ b/drivers/net/team/team.c
> @@ -992,7 +992,8 @@ static void __team_compute_features(struct team *team)
>         unsigned int dst_release_flag = IFF_XMIT_DST_RELEASE |
>                                         IFF_XMIT_DST_RELEASE_PERM;
>  
> -       list_for_each_entry(port, &team->port_list, list) {
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(port, &team->port_list, list) {
>                 vlan_features = netdev_increment_features(vlan_features,
>                                         port->dev->vlan_features,
>                                         TEAM_VLAN_FEATURES);
> @@ -1006,6 +1007,7 @@ static void __team_compute_features(struct team *team)
>                 if (port->dev->hard_header_len > max_hard_header_len)
>                         max_hard_header_len = port->dev->hard_header_len;
>         }
> +       rcu_read_unlock();
>  
>         team->dev->vlan_features = vlan_features;
>         team->dev->hw_enc_features = enc_features | NETIF_F_GSO_ENCAP_ALL |
> @@ -1020,9 +1022,7 @@ static void __team_compute_features(struct team *team)
>  
>  static void team_compute_features(struct team *team)
>  {
> -       mutex_lock(&team->lock);
>         __team_compute_features(team);
> -       mutex_unlock(&team->lock);
>         netdev_change_features(team->dev);
>  }

Yup, like this, but if Jiri doesn't like it then I guess we need to
come up with something else?

How about doing the work on unlock? Have some bit set when we had to
defer and then run __team_compute_features() before releasing the lock
for real?
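
To make the idea concrete, here is a rough userspace sketch of the
"set a bit, do the work on unlock" pattern. It is not kernel code and not
a proposed patch: every name ending in _sketch is hypothetical, a pthread
mutex stands in for team->lock, and a counter stands in for the real work
done by __team_compute_features(). It also leaves out the
netdev_change_features() call that follows in team_compute_features().

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct team; only the fields the sketch needs. */
struct team_sketch {
	pthread_mutex_t lock;    /* stands in for team->lock */
	bool features_deferred;  /* the "bit"; protected by lock */
	int compute_count;       /* how many times features were recomputed */
};

static void team_init_sketch(struct team_sketch *t)
{
	pthread_mutex_init(&t->lock, NULL);
	t->features_deferred = false;
	t->compute_count = 0;
}

/* Stand-in for __team_compute_features(); caller must hold t->lock. */
static void compute_features_locked_sketch(struct team_sketch *t)
{
	t->compute_count++;
}

static void team_lock_sketch(struct team_sketch *t)
{
	pthread_mutex_lock(&t->lock);
}

/*
 * Called with t->lock held from a path that cannot recompute features
 * right away: just remember that a recomputation is owed.
 */
static void team_defer_features_sketch(struct team_sketch *t)
{
	t->features_deferred = true;
}

/*
 * Unlock wrapper: if a recomputation was deferred while the lock was
 * held, run it now, still under the lock, then release the lock for real.
 */
static void team_unlock_sketch(struct team_sketch *t)
{
	if (t->features_deferred) {
		t->features_deferred = false;
		compute_features_locked_sketch(t);
	}
	pthread_mutex_unlock(&t->lock);
}
```

In this sketch several deferrals inside one critical section collapse into
a single recomputation at unlock time, and the result is in place before
any other lock holder can observe the team, which is the property the
"features had settled when the syscall returns" concern is about.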