Message-Id: <0867167a-73b8-0735-78ce-0d984f7a80b5@linux.ibm.com>
Date: Mon, 17 Feb 2020 09:49:41 +0100
From: Frederic Barrat <fbarrat@...ux.ibm.com>
To: Sasha Levin <sashal@...nel.org>, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Cc: Andrew Donnellan <ajd@...ux.ibm.com>,
Michael Ellerman <mpe@...erman.id.au>,
linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH AUTOSEL 5.5 096/542] powerpc/powernv/ioda: Fix ref count
for devices with their own PE
On 14/02/2020 at 16:41, Sasha Levin wrote:
> From: Frederic Barrat <fbarrat@...ux.ibm.com>
>
> [ Upstream commit 05dd7da76986937fb288b4213b1fa10dbe0d1b33 ]
Hi,
Upstream commit 05dd7da76986937fb288b4213b1fa10dbe0d1b33 doesn't really
need to go to stable (any of 4.19, 5.4 or 5.5). While it's probably
safe, the patch replaces one refcount leak with another. That makes
sense as part of the full series merged in 5.6-rc1, but it isn't terribly
useful standalone on the current stable branches.
Fred
> The pci_dn structure used to store a pointer to the struct pci_dev, so
> taking a reference on the device was required. However, the pci_dev
> pointer was later removed from the pci_dn structure, but the reference
> was kept for the npu device.
> See commit 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary
> pcidev from pci_dn").
>
> We don't need to take a reference on the device when assigning the PE
> as the struct pnv_ioda_pe is cleaned up at the same time as
> the (physical) device is released. Doing so prevents the device from
> being released, which is a problem for opencapi devices, since we want
> to be able to remove them through PCI hotplug.
>
> Now the ugly part: nvlink npu devices are not meant to be
> released. Because of the above, we've always leaked a reference and
> simply removing it now is dangerous and would likely require more
> work. There's currently no release device callback for nvlink devices
> for example. So to be safe, this patch leaks a reference on the npu
> device, but only for nvlink and not opencapi.
>
> Signed-off-by: Frederic Barrat <fbarrat@...ux.ibm.com>
> Reviewed-by: Andrew Donnellan <ajd@...ux.ibm.com>
> Signed-off-by: Michael Ellerman <mpe@...erman.id.au>
> Link: https://lore.kernel.org/r/20191121134918.7155-2-fbarrat@linux.ibm.com
> Signed-off-by: Sasha Levin <sashal@...nel.org>
> ---
> arch/powerpc/platforms/powernv/pci-ioda.c | 19 ++++++++++++-------
> 1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 4374836b033b4..67b836f102402 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1062,14 +1062,13 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
> return NULL;
> }
>
> - /* NOTE: We get only one ref to the pci_dev for the pdn, not for the
> - * pointer in the PE data structure, both should be destroyed at the
> - * same time. However, this needs to be looked at more closely again
> - * once we actually start removing things (Hotplug, SR-IOV, ...)
> + /* NOTE: We don't get a reference for the pointer in the PE
> + * data structure, both the device and PE structures should be
> + * destroyed at the same time. However, removing nvlink
> + * devices will need some work.
> *
> * At some point we want to remove the PDN completely anyways
> */
> - pci_dev_get(dev);
> pdn->pe_number = pe->pe_number;
> pe->flags = PNV_IODA_PE_DEV;
> pe->pdev = dev;
> @@ -1084,7 +1083,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
> pnv_ioda_free_pe(pe);
> pdn->pe_number = IODA_INVALID_PE;
> pe->pdev = NULL;
> - pci_dev_put(dev);
> return NULL;
> }
>
> @@ -1205,6 +1203,14 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
> struct pci_controller *hose = pci_bus_to_host(npu_pdev->bus);
> struct pnv_phb *phb = hose->private_data;
>
> + /*
> + * Intentionally leak a reference on the npu device (for
> + * nvlink only; this is not an opencapi path) to make sure it
> + * never goes away, as it's been the case all along and some
> + * work is needed otherwise.
> + */
> + pci_dev_get(npu_pdev);
> +
> /*
> * Due to a hardware errata PE#0 on the NPU is reserved for
> * error handling. This means we only have three PEs remaining
> @@ -1228,7 +1234,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
> */
> dev_info(&npu_pdev->dev,
> "Associating to existing PE %x\n", pe_num);
> - pci_dev_get(npu_pdev);
> npu_pdn = pci_get_pdn(npu_pdev);
> rid = npu_pdev->bus->number << 8 | npu_pdn->devfn;
> npu_pdn->pe_number = pe_num;
>
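
For readers following the thread, the refcount behaviour described in the
commit message can be illustrated with a small userspace model. This is
only a sketch under stated assumptions: struct toy_dev and the toy_*
helpers are hypothetical stand-ins, not the kernel's struct pci_dev,
pci_dev_get() or pci_dev_put(), and hotplug removal is modelled as simply
dropping the one reference the bus is assumed to hold.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted device; NOT struct pci_dev. */
struct toy_dev {
	int refcount;   /* number of outstanding references */
	bool released;  /* set once the last reference is dropped */
};

/* Take a reference (toy analogue of a "get" helper). */
static void toy_dev_get(struct toy_dev *dev)
{
	dev->refcount++;
}

/* Drop a reference; the device is released when the count hits zero. */
static void toy_dev_put(struct toy_dev *dev)
{
	if (--dev->refcount == 0)
		dev->released = true;
}

/*
 * Model of PE setup. With pin_device set, an extra reference is taken
 * and never dropped, i.e. an intentional leak (the nvlink case in the
 * commit message). Without it, no reference is taken (the device PE
 * and opencapi case).
 */
static void toy_setup_pe(struct toy_dev *dev, bool pin_device)
{
	if (pin_device)
		toy_dev_get(dev);
}

/* Model of hotplug removal: the bus drops its own reference. */
static bool toy_hotplug_remove(struct toy_dev *dev)
{
	toy_dev_put(dev);
	return dev->released;
}

/* Returns true if the device is actually released on removal. */
static bool toy_released_after_remove(bool pin_device)
{
	/* Assumption: the bus holds exactly one reference initially. */
	struct toy_dev dev = { .refcount = 1, .released = false };

	toy_setup_pe(&dev, pin_device);
	return toy_hotplug_remove(&dev);
}
```

In this model, toy_released_after_remove(false) reports that the device
goes away on removal, while toy_released_after_remove(true) reports that
the leaked reference keeps it alive, which mirrors the distinction the
patch draws between opencapi and nvlink devices.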