Message-ID: <Y5NA/oZ7RpeUXzJN@ziepe.ca>
Date: Fri, 9 Dec 2022 10:06:54 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: Hillf Danton <hdanton@...a.com>
Cc: Leon Romanovsky <leon@...nel.org>,
syzbot <syzbot+3fd8326d9a0812d19218@...kaller.appspotmail.com>,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
markzhang@...dia.com, ohartoov@...dia.com,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] WARNING: refcount bug in nldev_newlink

On Fri, Dec 09, 2022 at 09:42:29PM +0800, Hillf Danton wrote:
> On 9 Dec 2022 09:01:14 -0400 Jason Gunthorpe <jgg@...pe.ca>
> > On Thu, Dec 08, 2022 at 11:14:39AM +0200, Leon Romanovsky wrote:
> >
> > > Jason, what do you think?
> >
> > No, the key to this report is that the refcount dec is inside the tracker:
> >
> > > > __refcount_dec include/linux/refcount.h:344 [inline]
> > > > refcount_dec include/linux/refcount.h:359 [inline]
> > > > ref_tracker_free+0x539/0x6b0 lib/ref_tracker.c:118
> > > > netdev_tracker_free include/linux/netdevice.h:4039 [inline]
> >
> > Which is not underflowing the refcount on the dev, it is actually
> > trying to say the tracker has become unbalanced.
> >
> > E.g., this put is not matched with a hold that specified the tracker.
> >
> > Probably this:
> >
> > diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> > index ff35cebb25e265..115b77c5e9a146 100644
> > --- a/drivers/infiniband/core/device.c
> > +++ b/drivers/infiniband/core/device.c
> > @@ -2192,6 +2192,7 @@ static void free_netdevs(struct ib_device *ib_dev)
> >  		if (ndev) {
> >  			spin_lock(&ndev_hash_lock);
> >  			hash_del_rcu(&pdata->ndev_hash_link);
> > +			netdev_tracker_free(ndev, &pdata->netdev_tracker);
> >  			spin_unlock(&ndev_hash_lock);
> >
> >  			/*
> > @@ -2201,7 +2202,7 @@ static void free_netdevs(struct ib_device *ib_dev)
> >  			 * comparisons after the put
> >  			 */
> >  			rcu_assign_pointer(pdata->netdev, NULL);
> > -			dev_put(ndev);
> > +			__dev_put(ndev);
> >  		}
> >  		spin_unlock_irqrestore(&pdata->netdev_lock, flags);
> >  	}
>
> Wonder why this makes sense given rcu_assign_pointer(pdata->netdev, NULL)
> under pdata->netdev_lock.

Oh, yah, that is right, so we can just do the natural thing:

 			rcu_assign_pointer(pdata->netdev, NULL);
-			dev_put(ndev);
+			netdev_put(ndev, &pdata->netdev_tracker);

Jason