[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240703160210.83667-1-aha310510@gmail.com>
Date: Thu, 4 Jul 2024 01:02:10 +0900
From: Jeongjun Park <aha310510@...il.com>
To: michal.kubiak@...el.com
Cc: aha310510@...il.com,
davem@...emloft.net,
edumazet@...gle.com,
jiri@...nulli.us,
kuba@...nel.org,
linux-kernel@...r.kernel.org,
netdev@...r.kernel.org,
pabeni@...hat.com,
syzbot+705c61d60b091ef42c04@...kaller.appspotmail.com,
syzkaller-bugs@...glegroups.com
Subject: Re: [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
>
> On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
> > CPU0 CPU1
> > ---- ----
> > lock(&rdev->wiphy.mtx);
> > lock(team->team_lock_key#4);
> > lock(&rdev->wiphy.mtx);
> > lock(team->team_lock_key#4);
> >
> > Deadlock occurs due to the above scenario. Therefore,
> > modify the code as shown in the patch below to prevent deadlock.
> >
> > Regards,
> > Jeongjun Park.
>
> The commit message should contain the patch description only (without
> salutations, etc.).
>
> >
> > Reported-and-tested-by: syzbot+705c61d60b091ef42c04@...kaller.appspotmail.com
> > Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
> > Signed-off-by: Jeongjun Park <aha310510@...il.com>
> > ---
> > drivers/net/team/team_core.c | 14 ++++++++------
> > 1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > index ab1935a4aa2c..3ac82df876b0 100644
> > --- a/drivers/net/team/team_core.c
> > +++ b/drivers/net/team/team_core.c
> > @@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> > struct netlink_ext_ack *extack)
> > {
> > struct team *team = netdev_priv(dev);
> > - int err;
> > + int err, locked;
> >
> > - mutex_lock(&team->lock);
> > + locked = mutex_trylock(&team->lock);
> > err = team_port_add(team, port_dev, extack);
> > - mutex_unlock(&team->lock);
> > + if (locked)
> > + mutex_unlock(&team->lock);
>
> This is not correct usage of 'mutex_trylock()' API. In such a case you
> could as well remove the lock completely from that part of code.
> If "mutex_trylock()" returns false it means the mutex cannot be taken
> (because it was already taken by other thread), so you should not modify
> the resources that were expected to be protected by the mutex.
> In other words, there is a risk of modifying resources using
> "team_port_add()" by several threads at a time.
>
> >
> > if (!err)
> > netdev_change_features(dev);
> > @@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> > static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> > {
> > struct team *team = netdev_priv(dev);
> > - int err;
> > + int err, locked;
> >
> > - mutex_lock(&team->lock);
> > + locked = mutex_trylock(&team->lock);
> > err = team_port_del(team, port_dev);
> > - mutex_unlock(&team->lock);
> > + if (locked)
> > + mutex_unlock(&team->lock);
>
> The same story as in case of "team_add_slave()".
>
> >
> > if (err)
> > return err;
> > --
> >
>
> The patch does not seem to be a correct solution to remove a deadlock.
> Most probably a synchronization design needs an inspection.
> If you really want to use "mutex_trylock()" API, please consider several
> attempts of taking the mutex, but never modify the protected resources when
> the mutex is not taken successfully.
>
Thanks for your comment. I rewrote the patch based on those comments.
This time, we modified it to return an error so that resources are not
modified when a race situation occurs. We would appreciate your
feedback on what this patch would be like.
> Thanks,
> Michal
>
>
Regards,
Jeongjun Park
---
drivers/net/team/team_core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..43d7c73b25aa 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
struct team *team = netdev_priv(dev);
int err;
- mutex_lock(&team->lock);
+ if (!mutex_trylock(&team->lock))
+ return -EBUSY;
err = team_port_add(team, port_dev, extack);
mutex_unlock(&team->lock);
@@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
struct team *team = netdev_priv(dev);
int err;
- mutex_lock(&team->lock);
+ if (!mutex_trylock(&team->lock))
+ return -EBUSY;
err = team_port_del(team, port_dev);
mutex_unlock(&team->lock);
--
Powered by blists - more mailing lists