[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZrZJqd8FBLU_GqFH@x1n>
Date: Fri, 9 Aug 2024 12:54:01 -0400
From: Peter Xu <peterx@...hat.com>
To: David Hildenbrand <david@...hat.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Sean Christopherson <seanjc@...gle.com>,
Oscar Salvador <osalvador@...e.de>,
Jason Gunthorpe <jgg@...dia.com>,
Axel Rasmussen <axelrasmussen@...gle.com>,
linux-arm-kernel@...ts.infradead.org, x86@...nel.org,
Will Deacon <will@...nel.org>, Gavin Shan <gshan@...hat.com>,
Paolo Bonzini <pbonzini@...hat.com>, Zi Yan <ziy@...dia.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Catalin Marinas <catalin.marinas@....com>,
Ingo Molnar <mingo@...hat.com>,
Alistair Popple <apopple@...dia.com>,
Borislav Petkov <bp@...en8.de>,
Thomas Gleixner <tglx@...utronix.de>, kvm@...r.kernel.org,
Dave Hansen <dave.hansen@...ux.intel.com>,
Alex Williamson <alex.williamson@...hat.com>,
Yan Zhao <yan.y.zhao@...el.com>
Subject: Re: [PATCH 06/19] mm/pagewalk: Check pfnmap early for
folio_walk_start()
On Fri, Aug 09, 2024 at 06:20:06PM +0200, David Hildenbrand wrote:
> On 09.08.24 18:08, Peter Xu wrote:
> > Pfnmaps can always be identified with special bits in the ptes/pmds/puds.
> > However that's unnecessary if the vma is stable, and when it's mapped under
> > VM_PFNMAP | VM_IO.
> >
> > Instead of adding similar checks in all the levels for huge pfnmaps, let
> > folio_walk_start() fail even earlier for these mappings. It's also
> > something gup-slow already does, so make them match.
> >
> > Cc: David Hildenbrand <david@...hat.com>
> > Signed-off-by: Peter Xu <peterx@...hat.com>
> > ---
> > mm/pagewalk.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/mm/pagewalk.c b/mm/pagewalk.c
> > index cd79fb3b89e5..fd3965efe773 100644
> > --- a/mm/pagewalk.c
> > +++ b/mm/pagewalk.c
> > @@ -727,6 +727,11 @@ struct folio *folio_walk_start(struct folio_walk *fw,
> > p4d_t *p4dp;
> > mmap_assert_locked(vma->vm_mm);
> > +
> > + /* It has no folio backing the mappings at all.. */
> > + if (vma->vm_flags & (VM_IO | VM_PFNMAP))
> > + return NULL;
> > +
>
> That is in general not what we want, and we still have some places that
> wrongly hard-code that behavior.
>
> In a MAP_PRIVATE mapping you might have anon pages that we can happily walk.
>
> vm_normal_page() / vm_normal_page_pmd() [and as commented as a TODO,
> vm_normal_page_pud()] should be able to identify PFN maps and reject them,
> no?
Yep, I think we can also rely on special bit.
When I was working on this whole series I must confess I am already
confused on the real users of MAP_PRIVATE pfnmaps. E.g. we probably don't
need either PFNMAP for either mprotect/fork/... at least for our use case,
then VM_PRIVATE is even one step further.
Here I chose to follow gup-slow, and I suppose you meant that's also wrong?
If so, would it make sense we keep them aligned as of now, and change them
altogether? Or do you think we should just rely on the special bits?
And, just curious: is there any use case you're aware of that can benefit
from caring PRIVATE pfnmaps yet so far, especially in this path?
As far as I read, none of folio_walk_start() users so far should even
stumble on top of a pfnmap, share or private. But that's a fairly quick
glimps only. IOW, I was wondering whether I'm just over cautious here.
Thanks,
--
Peter Xu
Powered by blists - more mailing lists