Message-ID: <20220513114553.GK1343366@nvidia.com>
Date: Fri, 13 May 2022 08:45:53 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@....com>
Cc: david@...hat.com, Felix.Kuehling@....com, linux-mm@...ck.org,
rcampbell@...dia.com, linux-ext4@...r.kernel.org,
linux-xfs@...r.kernel.org, amd-gfx@...ts.freedesktop.org,
dri-devel@...ts.freedesktop.org, hch@....de, jglisse@...hat.com,
apopple@...dia.com, willy@...radead.org, akpm@...ux-foundation.org
Subject: Re: [PATCH v1 13/15] mm: handling Non-LRU pages returned by
vm_normal_pages
On Thu, May 12, 2022 at 05:33:44PM -0500, Sierra Guiza, Alejandro (Alex) wrote:
>
> On 5/11/2022 1:50 PM, Jason Gunthorpe wrote:
> > On Thu, May 05, 2022 at 04:34:36PM -0500, Alex Sierra wrote:
> >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 76e3af9639d9..892c4cc54dc2 100644
> > > +++ b/mm/memory.c
> > > @@ -621,6 +621,13 @@ struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> > > if (is_zero_pfn(pfn))
> > > return NULL;
> > > if (pte_devmap(pte))
> > > +/*
> > > + * NOTE: Technically this should goto check_pfn label. However, page->_mapcount
> > > + * is never incremented for device pages that are mmap through DAX mechanism
> > > + * using pmem driver mounted into ext4 filesystem. When these pages are unmap,
> > > + * zap_pte_range is called and vm_normal_page return a valid page with
> > > + * page_mapcount() = 0, before page_remove_rmap is called.
> > > + */
> > > return NULL;
> > ? Where does this series cause device coherent to be returned?
> In our case, device coherent pages could be obtained as a result of
> migration(Patches 6/7 of 15), ending up mapped in CPU page tables. Later on,
> these pages might need to be returned by get_user_pages or other callers
> through vm_normal_pages. Our approach in this series, is to handle
> device-coherent-managed pages returned by vm_normal_pages, inside each
> caller. EX. device coherent pages don’t support LRU lists, NUMA migration or
> THP.
> >
> > Wasn't the plan to not set pte_devmap() ?
>
> amdgpu does not set pte_devmap for our DEVICE_COHERENT pages. DEVMAP flags
> are set by drivers like virtio_fs or pmem, where MEMORY_DEVICE_FS_DAX type
> is used.
> This patch series deals with DEVICE_COHERENT pages. My understanding was,
> that the DAX code and DEVICE_GENERIC would be fixed up later by someone more
> familiar with it. Were you expecting that we'd fix the DAX usage of
> pte_devmap flags in this patch series as well?
No, I was just trying to find where the pages got inserted and
understand the comment above. I think the comment should be clarified
more like you explained:
New users of ZONE_DEVICE will not set pte_devmap() and will have
refcounts incremented on their struct pages when they are inserted
into PTEs, thus they are safe to return here. Legacy ZONE_DEVICE
pages that set pte_devmap() do not have refcounts. ....
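
As a rough illustration of the rule in the suggested comment, here is a
user-space model of the check, not the kernel code. The names model_page,
model_pte and model_vm_normal_page are hypothetical stand-ins; only the
zero-pfn and pte_devmap() tests mirror the hunk quoted above, and the
rest of vm_normal_page() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct model_page {
    int refcount;
};

/* Hypothetical stand-in for a PTE and the properties the check looks at. */
struct model_pte {
    bool zero_pfn;            /* models is_zero_pfn(pfn) */
    bool devmap;              /* models pte_devmap(pte) */
    struct model_page *page;  /* page backing this mapping, if any */
};

/*
 * Model of the rule: a legacy ZONE_DEVICE mapping that sets the devmap
 * bit is not refcounted when inserted into a PTE, so it is hidden from
 * callers. A mapping without the devmap bit (for example a newer
 * ZONE_DEVICE user that takes a reference on insertion) is returned.
 */
struct model_page *model_vm_normal_page(const struct model_pte *pte)
{
    assert(pte != NULL);

    if (pte->zero_pfn)
        return NULL;
    if (pte->devmap)
        return NULL;
    return pte->page;
}
```

With this model, a PTE with devmap set always yields NULL, while a PTE
without it yields its page; any further handling of such pages is left to
each caller, as described earlier in the thread.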
Jason