Message-ID: <63364b20ee66d_7390294a1@dwillia2-mobl3.amr.corp.intel.com.notmuch>
Date: Thu, 29 Sep 2022 18:49:20 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Alistair Popple <apopple@...dia.com>,
Dan Williams <dan.j.williams@...el.com>
CC: Jason Gunthorpe <jgg@...dia.com>, <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Michael Ellerman <mpe@...erman.id.au>,
"Nicholas Piggin" <npiggin@...il.com>,
Felix Kuehling <Felix.Kuehling@....com>,
"Alex Deucher" <alexander.deucher@....com>,
Christian König <christian.koenig@....com>,
"Pan, Xinhui" <Xinhui.Pan@....com>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
Ben Skeggs <bskeggs@...hat.com>,
Karol Herbst <kherbst@...hat.com>,
Lyude Paul <lyude@...hat.com>,
Ralph Campbell <rcampbell@...dia.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Alex Sierra <alex.sierra@....com>,
"John Hubbard" <jhubbard@...dia.com>,
<linuxppc-dev@...ts.ozlabs.org>, <linux-kernel@...r.kernel.org>,
<amd-gfx@...ts.freedesktop.org>, <nouveau@...ts.freedesktop.org>,
<dri-devel@...ts.freedesktop.org>
Subject: Re: [PATCH 2/7] mm: Free device private pages have zero refcount

Alistair Popple wrote:
>
> Dan Williams <dan.j.williams@...el.com> writes:
>
> > Alistair Popple wrote:
> >>
> >> Jason Gunthorpe <jgg@...dia.com> writes:
> >>
> >> > On Mon, Sep 26, 2022 at 04:03:06PM +1000, Alistair Popple wrote:
> >> >> Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page
> >> >> refcount") device private pages have no longer had an extra reference
> >> >> count when the page is in use. However before handing them back to the
> >> >> owning device driver we add an extra reference count such that free
> >> >> pages have a reference count of one.
> >> >>
> >> >> This makes it difficult to tell if a page is free or not because both
> >> >> free and in use pages will have a non-zero refcount. Instead we should
> >> >> return pages to the driver's page allocator with a zero reference count.
> >> >> Kernel code can then safely use kernel functions such as
> >> >> get_page_unless_zero().
> >> >>
> >> >> Signed-off-by: Alistair Popple <apopple@...dia.com>
> >> >> ---
> >> >> arch/powerpc/kvm/book3s_hv_uvmem.c | 1 +
> >> >> drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 1 +
> >> >> drivers/gpu/drm/nouveau/nouveau_dmem.c | 1 +
> >> >> lib/test_hmm.c | 1 +
> >> >> mm/memremap.c | 5 -----
> >> >> mm/page_alloc.c | 6 ++++++
> >> >> 6 files changed, 10 insertions(+), 5 deletions(-)
> >> >
> >> > I think this is a great idea, but I'm surprised no dax stuff is
> >> > touched here?
> >>
> >> free_zone_device_page() shouldn't be called for pgmap->type ==
> >> MEMORY_DEVICE_FS_DAX so I don't think we should have to worry about DAX
> >> there. Except that the folio code looks like it might have introduced a
> >> bug. AFAICT put_page() always calls
> >> put_devmap_managed_page(&folio->page) but folio_put() does not (although
> >> folios_put() does!). So it seems folio_put() won't end up calling
> >> __put_devmap_managed_page_refs() as I think it should.
> >>
> >> I think you're right about the change to __init_zone_device_page() - I
> >> should limit it to DEVICE_PRIVATE/COHERENT pages only. But I need to
> >> look at Dan's patch series more closely as I suspect it might be better
> >> to rebase this patch on top of that.
> >
> > Apologies for the delay; I was travelling the past few days. Yes, I think
> > this patch slots in nicely to avoid the introduction of an init_mode
> > [1]:
> >
> > https://lore.kernel.org/nvdimm/166329940343.2786261.6047770378829215962.stgit@dwillia2-xfh.jf.intel.com/
> >
> > Mind if I steal it into my series?
>
> No problem, although I notice Andrew has already merged it into
> mm-unstable. If you end up rebasing your series on top of mine I think
> all that's needed is a patch somewhere in your series to drop the
> various `if (pgmap->type == MEMORY_DEVICE_*)` I added to (hopefully)
> avoid breaking DAX. Assuming DAX takes a pagemap reference on struct
> page allocation something like below.

Yeah, I'll go that route and rebase on top of -mm.

Thanks again.
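
For anyone following the refcount argument in the quoted commit message, a
minimal userspace sketch of why a zero refcount on free pages matters might
look like the following. It is a model only, not kernel code: struct
model_page, model_get_page_unless_zero(), model_put_page() and model_demo()
are hypothetical names, and C11 atomics stand in for the kernel's page
refcount helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modeled. */
struct model_page {
    atomic_int refcount;
};

/* Take a reference only if the count is currently non-zero. */
static bool model_get_page_unless_zero(struct model_page *page)
{
    int old = atomic_load(&page->refcount);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&page->refcount, &old, old + 1))
            return true;
    }
    return false; /* the page was free, so no reference was taken */
}

/* Drop a reference; returns true when the count reached zero. */
static bool model_put_page(struct model_page *page)
{
    return atomic_fetch_sub(&page->refcount, 1) == 1;
}

/* Walk one page through free -> in use -> free. */
static void model_demo(void)
{
    struct model_page page = { .refcount = 0 };  /* free: refcount zero */

    assert(!model_get_page_unless_zero(&page));  /* free pages are refused */

    atomic_store(&page.refcount, 1);             /* assumed driver allocation */
    assert(model_get_page_unless_zero(&page));   /* in use: 1 -> 2 */

    assert(!model_put_page(&page));              /* 2 -> 1, still in use */
    assert(model_put_page(&page));               /* 1 -> 0, free again */
    assert(!model_get_page_unless_zero(&page));  /* refused once more */
}
```

If free pages instead sat at a refcount of one, as described for the code
before this patch, the first check in model_demo() would succeed and a free
page would be indistinguishable from one that is in use.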