Message-ID: <vrti6maahtwfrd6xrdmyupunprioodhl7x5alpi2r6kyi4qcyr@ga6a5yrdvmb2>
Date: Fri, 9 Jan 2026 11:41:51 +1100
From: Alistair Popple <apopple@...dia.com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Hou Tao <houtao@...weicloud.com>, linux-kernel@...r.kernel.org,
linux-pci@...r.kernel.org, linux-mm@...ck.org, linux-nvme@...ts.infradead.org,
Bjorn Helgaas <bhelgaas@...gle.com>, Logan Gunthorpe <logang@...tatee.com>,
Leon Romanovsky <leonro@...dia.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Tejun Heo <tj@...nel.org>, "Rafael J . Wysocki" <rafael@...nel.org>,
Danilo Krummrich <dakr@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...nel.org>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>, houtao1@...wei.com
Subject: Re: [PATCH 01/13] PCI/P2PDMA: Release the per-cpu ref of pgmap when
vm_insert_page() fails

On 2026-01-09 at 02:55 +1100, Bjorn Helgaas <helgaas@...nel.org> wrote...
> On Thu, Jan 08, 2026 at 02:23:16PM +1100, Alistair Popple wrote:
> > On 2025-12-20 at 15:04 +1100, Hou Tao <houtao@...weicloud.com> wrote...
> > > From: Hou Tao <houtao1@...wei.com>
> > >
> > > When vm_insert_page() fails in p2pmem_alloc_mmap(), p2pmem_alloc_mmap()
> > > doesn't invoke percpu_ref_put() to free the per-cpu ref of pgmap
> > > acquired after gen_pool_alloc_owner(), and memunmap_pages() will hang
> > > forever when trying to remove the PCIe device.
> > >
> > > Fix it by adding the missed percpu_ref_put().
> >
> > This pairs with the percpu_ref_tryget_live_rcu() above right? Might
> > be worth mentioning that as a comment, but overall looks good to me
> > so feel free to add:
> >
> > Reviewed-by: Alistair Popple <apopple@...dia.com>
>
> Added your Reviewed-by, thanks!
>
> Would the following commit log address your suggestion?
>
> When the vm_insert_page() in p2pmem_alloc_mmap() failed, we did not
> invoke percpu_ref_put() to free the per-CPU pgmap ref acquired by
> percpu_ref_tryget_live_rcu(), which meant that PCI device removal would
> hang forever in memunmap_pages().
>
> Fix it by adding the missed percpu_ref_put().

Yes, that looks perfect. Thanks.

> Looking at this again, I'm confused about why in the normal, non-error
> case, we do the percpu_ref_tryget_live_rcu(ref), followed by another
> percpu_ref_get(ref) for each page, followed by just a single
> percpu_ref_put() at the exit.
>
> So we do ref_get() "1 + number of pages" times but we only do a single
> ref_put(). Is there a loop of ref_put() for each page elsewhere?

Right, the per-page ref_put() happens when the page is freed (i.e. when the
struct page refcount drops to zero). In this case free_zone_device_folio() will
call p2pdma_folio_free(), which has the corresponding percpu_ref_put().
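
The accounting described above can be sketched as a small user-space model.
This is only an illustration, not kernel code: struct toy_ref is a plain
counter standing in for the pgmap percpu_ref, and toy_alloc_mmap(),
toy_free_pages(), fail_at and with_fix are hypothetical names made up for the
example. The model assumes that pages inserted before a failure keep their
per-page reference until they are freed later.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the pgmap percpu_ref: a plain counter, not the kernel type. */
struct toy_ref {
	long count;
};

static void toy_ref_get(struct toy_ref *ref)
{
	ref->count++;
}

static void toy_ref_put(struct toy_ref *ref)
{
	assert(ref->count > 0);
	ref->count--;
}

/*
 * Hypothetical model of the mmap loop. fail_at is the index of the page
 * whose insert fails (negative: no failure). with_fix selects whether the
 * error path drops the initial reference. *mapped reports how many pages
 * were inserted and therefore still hold a per-page reference.
 */
static int toy_alloc_mmap(struct toy_ref *ref, int npages, int fail_at,
			  bool with_fix, int *mapped)
{
	int i;

	*mapped = 0;
	toy_ref_get(ref);		/* models percpu_ref_tryget_live_rcu() */

	for (i = 0; i < npages; i++) {
		if (i == fail_at) {	/* models vm_insert_page() failing */
			if (with_fix)
				toy_ref_put(ref);	/* the put added by the patch */
			return -1;
		}
		toy_ref_get(ref);	/* models the per-page percpu_ref_get() */
		(*mapped)++;
	}

	toy_ref_put(ref);		/* models the single percpu_ref_put() at exit */
	return 0;
}

/* Models the pages being freed later: one put per page. */
static void toy_free_pages(struct toy_ref *ref, int npages)
{
	while (npages-- > 0)
		toy_ref_put(ref);
}
```

In this model a successful four-page mapping leaves the counter at 4 until the
pages are freed, at which point it returns to 0. A failure on the third page
without the extra put leaves the counter at 1 even after the two mapped pages
are freed, which corresponds to the hang in memunmap_pages() that the patch
fixes.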

It would be nice to harmonize the pgmap refcounting across all ZONE_DEVICE
users. For example, MEMORY_DEVICE_PRIVATE/COHERENT pages drop the reference
in the generic free_zone_device_folio() rather than in the type-specific free
callback. That said, the whole thing is actually a bit redundant now and I have
debated removing it entirely. It really just serves as an optimised way to do
a sanity check that no pages are in use when memunmap_pages() is called. The
alternative would be to just check the refcount of every page.
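
To make the trade-off concrete, here is a rough user-space sketch of the two
ways of answering "are any pages still in use?". It is not kernel code:
struct toy_page, struct toy_pgmap and both helpers are hypothetical, and the
sketch assumes that an idle page has a refcount of zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy page with only a refcount; not the kernel's struct page. */
struct toy_page {
	int refcount;
};

/* Toy pgmap: one aggregate counter plus the pages it covers. */
struct toy_pgmap {
	long ref;			/* stands in for the pgmap percpu_ref */
	struct toy_page *pages;
	size_t nr_pages;
};

/* Aggregate check: a single comparison, independent of nr_pages. */
static bool toy_idle_by_counter(const struct toy_pgmap *pgmap)
{
	return pgmap->ref == 0;
}

/* Alternative check: walk every page and look at its own refcount. */
static bool toy_idle_by_scan(const struct toy_pgmap *pgmap)
{
	size_t i;

	for (i = 0; i < pgmap->nr_pages; i++) {
		if (pgmap->pages[i].refcount != 0)
			return false;
	}
	return true;
}
```

Both helpers give the same answer as long as the aggregate counter is kept in
step with the per-page refcounts. The scan costs time proportional to the
number of pages but needs no separate get/put pairing to maintain.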
> > > Fixes: 7e9c7ef83d78 ("PCI/P2PDMA: Allow userspace VMA allocations through sysfs")
> > > Signed-off-by: Hou Tao <houtao1@...wei.com>
> > > ---
> > > drivers/pci/p2pdma.c | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> > > index 4a2fc7ab42c3..218c1f5252b6 100644
> > > --- a/drivers/pci/p2pdma.c
> > > +++ b/drivers/pci/p2pdma.c
> > > @@ -152,6 +152,7 @@ static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj,
> > > ret = vm_insert_page(vma, vaddr, page);
> > > if (ret) {
> > > gen_pool_free(p2pdma->pool, (uintptr_t)kaddr, len);
> > > + percpu_ref_put(ref);
> > > return ret;
> > > }
> > > percpu_ref_get(ref);
> > > --
> > > 2.29.2
> > >
>