Message-ID: <20210122093027.33b2e8e7@ceranb>
Date: Fri, 22 Jan 2021 09:30:27 +0100
From: Ivan Vecera <ivecera@...hat.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Cong Wang <xiyou.wangcong@...il.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
saeed@...nel.org, Jiri Pirko <jiri@...nulli.us>
Subject: Re: [PATCH net] team: postpone features update to avoid deadlock
On Thu, 21 Jan 2021 18:34:52 -0800
Jakub Kicinski <kuba@...nel.org> wrote:
> On Thu, 21 Jan 2021 11:29:37 +0100 Ivan Vecera wrote:
> > On Wed, 20 Jan 2021 15:18:20 -0800
> > Cong Wang <xiyou.wangcong@...il.com> wrote:
> > > On Wed, Jan 20, 2021 at 4:56 AM Ivan Vecera <ivecera@...hat.com> wrote:
> > > > Team driver protects port list traversal by its team->lock mutex
> > > > in functions like team_change_mtu(), team_set_rx_mode(),
>
> The set_rx_mode part can't be true, set_rx_mode can't sleep and
> team->lock is a mutex.
>
> > > > To fix the problem __team_compute_features() needs to be postponed
> > > > for these cases.
> > >
> > > Is there any user-visible effect after deferring this feature change?
> >
> > A user should not notice this change.
>
> I think Cong is right, can you expand a little on your assertion?
> User should be able to assume that the moment syscall returns the
> features had settled.
>
> What does team->mutex actually protect in team_compute_features()?
> All callers seem to hold RTNL at a quick glance. This is a bit of
> a long shot but isn't it just trying to protect the iteration over
> ports which could be under RCU?

In fact, the mutex could be removed entirely: all port-list writers
run under rtnl_lock, as do some readers such as team_change_mtu() and
team_device_event() [the notifier], and the hot-path readers are
protected by RCU.

I have discussed this with Jiri, but he doesn't want to introduce any
dependency on RTNL into team, as it was designed to be RTNL-independent
from the beginning.

Anyway, your idea to run team_compute_features() under RCU could be
fine, as the subsequent __team_compute_features() cannot sleep...

Do you mean something like this?

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index c19dac21c468..dd7917cab2b1 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -992,7 +992,8 @@ static void __team_compute_features(struct team *team)
 	unsigned int dst_release_flag = IFF_XMIT_DST_RELEASE |
 					IFF_XMIT_DST_RELEASE_PERM;
 
-	list_for_each_entry(port, &team->port_list, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list) {
 		vlan_features = netdev_increment_features(vlan_features,
 					port->dev->vlan_features,
 					TEAM_VLAN_FEATURES);
@@ -1006,6 +1007,7 @@ static void __team_compute_features(struct team *team)
 		if (port->dev->hard_header_len > max_hard_header_len)
 			max_hard_header_len = port->dev->hard_header_len;
 	}
+	rcu_read_unlock();
 
 	team->dev->vlan_features = vlan_features;
 	team->dev->hw_enc_features = enc_features | NETIF_F_GSO_ENCAP_ALL |
@@ -1020,9 +1022,7 @@ static void __team_compute_features(struct team *team)
 
 static void team_compute_features(struct team *team)
 {
-	mutex_lock(&team->lock);
 	__team_compute_features(team);
-	mutex_unlock(&team->lock);
 	netdev_change_features(team->dev);
 }
 
Thanks for the comments,
Ivan