Message-ID: <74E5B871-3D33-4C75-8FD4-C5D5BE2182AD@vmware.com>
Date: Thu, 28 Jun 2018 21:15:46 +0000
From: Adit Ranadive <aditr@...are.com>
To: Jason Gunthorpe <jgg@...pe.ca>, Neil Horman <nhorman@...driver.com>
CC: "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
pv-drivers <pv-drivers@...are.com>,
Doug Ledford <dledford@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] vmw_pvrdma: Release netdev when vmxnet3 module is removed
On 6/28/18, 1:37 PM, "Jason Gunthorpe" <jgg@...pe.ca> wrote:
> On Thu, Jun 28, 2018 at 03:45:26PM -0400, Neil Horman wrote:
> > On Thu, Jun 28, 2018 at 12:59:46PM -0600, Jason Gunthorpe wrote:
> > > On Thu, Jun 28, 2018 at 09:59:38AM -0400, Neil Horman wrote:
> > > > On repeated module load/unload cycles, it's possible for the pvrdma
> > > > driver to encounter this crash:
<snip>
> > > > @@ -962,6 +982,7 @@ static int pvrdma_pci_probe(struct pci_dev *pdev,
> > > > }
> > > >
> > > > dev->netdev = pci_get_drvdata(pdev_net);
> > > > + dev_hold(dev->netdev);
That doesn't seem right. If the vmxnet3 driver isn't loaded at all or failed
to create a netdev, you would be requesting a hold on a NULL netdev. What if
you moved this to after the if(!dev->netdev) check?
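
To make the ordering concrete, here is a minimal userspace sketch, not the
driver code. struct fake_netdev, fake_dev_hold() and fake_probe_get_netdev()
are hypothetical stand-ins for struct net_device, dev_hold() and the lookup
step in pvrdma_pci_probe(); the only point is that the reference is taken
after the NULL check, never before it.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in; NOT the kernel's struct net_device. */
struct fake_netdev {
	int refcnt;
};

/* Stand-in for dev_hold(); asserts because holding NULL is the bug. */
static void fake_dev_hold(struct fake_netdev *nd)
{
	assert(nd != NULL);
	nd->refcnt++;
}

/*
 * Mirrors the probe step under discussion: store the looked-up netdev,
 * bail out if it is NULL, and only then take the reference.
 */
static int fake_probe_get_netdev(struct fake_netdev *looked_up,
				 struct fake_netdev **out)
{
	*out = looked_up;
	if (!*out)
		return -1;	/* "failed to get vmxnet3 device" path, no hold */

	fake_dev_hold(*out);
	return 0;
}
```

With this ordering, the case where vmxnet3 is not loaded returns an error
without touching a reference count at all.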
> > > > pci_dev_put(pdev_net);
> > > > if (!dev->netdev) {
> > > > dev_err(&pdev->dev, "failed to get vmxnet3 device\n");
> > >
> > > I see a lot of new dev_hold's here, where are the matching
> > > dev_puts()?
> > >
> > I'm not sure I'd call 2 a lot, but sure, there is a new dev_hold in the
> > pvrdma_pci_probe routine, to hold a reference to the netdev that is looked up
> > there. It is balanced by the NETDEV_UNREGISTER case in
> > pvrdma_netdevice_event_handle. The UNREGISTER clause is also balancing the
> > NETDEV_REGISTER case of the handler that looks up the matching netdev should a
> > new device be registered. Note that we will only hold a single device at a
> > time, because a given pvrdma device only recognizes a single vmxnet3 device
> > (the one on function 0 of its own bus/device tuple).
>
> I don't see how the dev_hold in pvrdma_pci_probe is undone during
> error unwind (eg goto err_free_cq_ring)
>
> And I don't see how it is put when pvrdma_pci_remove() is called.
That's right. These seem missing as well.
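
A rough userspace sketch of what a balanced version could look like, under
stated assumptions: the fake_* types and helpers are hypothetical stand-ins
for the kernel's net_device refcounting, the err_put_netdev label is made up
for illustration, and the NULL check in the remove path assumes the
NETDEV_UNREGISTER handler clears dev->netdev when it drops the reference.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; NOT the kernel structures. */
struct fake_netdev {
	int refcnt;
};

struct fake_pvrdma_dev {
	struct fake_netdev *netdev;
};

static void fake_dev_hold(struct fake_netdev *nd)
{
	assert(nd != NULL);
	nd->refcnt++;
}

static void fake_dev_put(struct fake_netdev *nd)
{
	assert(nd != NULL && nd->refcnt > 0);
	nd->refcnt--;
}

/* fail_later simulates a failure after the hold (e.g. a ring allocation). */
static int fake_probe(struct fake_pvrdma_dev *dev, struct fake_netdev *nd,
		      int fail_later)
{
	dev->netdev = nd;
	if (!dev->netdev)
		return -1;
	fake_dev_hold(dev->netdev);

	if (fail_later)
		goto err_put_netdev;
	return 0;

err_put_netdev:
	/* Error unwind drops the reference taken above. */
	fake_dev_put(dev->netdev);
	dev->netdev = NULL;
	return -1;
}

static void fake_remove(struct fake_pvrdma_dev *dev)
{
	/* Assumption: an earlier UNREGISTER event may already have put it. */
	if (dev->netdev) {
		fake_dev_put(dev->netdev);
		dev->netdev = NULL;
	}
}
```

Every path that took the hold releases it exactly once: the failed probe
puts it during unwind, and remove puts it only if it is still held.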