Message-ID: <aUG5y60q03RedLwv@x1.local>
Date: Tue, 16 Dec 2025 14:58:03 -0500
From: Peter Xu <peterx@...hat.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: kvm@...r.kernel.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Nico Pache <npache@...hat.com>, Zi Yan <ziy@...dia.com>,
Alex Mastro <amastro@...com>, David Hildenbrand <david@...hat.com>,
Alex Williamson <alex@...zbot.org>, Zhi Wang <zhiw@...dia.com>,
David Laight <david.laight.linux@...il.com>,
Yi Liu <yi.l.liu@...el.com>, Ankit Agrawal <ankita@...dia.com>,
Kevin Tian <kevin.tian@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with
!MAP_FIXED mappings
On Tue, Dec 16, 2025 at 03:01:31PM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 16, 2025 at 11:01:00AM -0500, Peter Xu wrote:
> > Do we have any function that we can fetch the best mapping lower than a
> > specific order?
>
> I'm not aware of anything
Maybe I can introduce a per-arch helper for it, then. I'll see if I can
cover some tests on the ARM side; otherwise I'll enable x86_64 first so we
can do it in two steps.
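To make the idea concrete, here is a rough userspace sketch of what such a helper could compute: the largest order the arch can map as a leaf entry without exceeding a given order. The name arch_best_map_order() and the order table are hypothetical and not an existing kernel API; the values assume x86_64 with 4 KiB base pages (PMD order 9, PUD order 18).

```c
#include <stddef.h>

/*
 * Hypothetical table: orders (log2 of base pages) at which this arch can
 * install a leaf mapping. Assumes x86_64 with 4 KiB pages: PTE, PMD, PUD.
 */
static const unsigned int arch_leaf_orders[] = { 0, 9, 18 };

/*
 * Hypothetical helper, not an existing kernel function: return the
 * largest mappable order that does not exceed max_order. Order 0 is
 * always available, so this never fails.
 */
static unsigned int arch_best_map_order(unsigned int max_order)
{
	unsigned int best = 0;
	size_t i;

	for (i = 0; i < sizeof(arch_leaf_orders) / sizeof(arch_leaf_orders[0]); i++) {
		if (arch_leaf_orders[i] <= max_order && arch_leaf_orders[i] > best)
			best = arch_leaf_orders[i];
	}
	return best;
}
```

With this table, an order-12 hint would fall back to order 9 (PMD), and anything at or above order 18 would be capped at order 18 (PUD). A real per-arch version would need to derive the table from the arch's page table layout and config rather than hardcode it.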
>
> > > None of this logic should be in drivers.
> >
> > I still think it's the driver's decision to have its own macro controlling
> > the huge pfnmap behavior. I agree with you core mm can have it, I don't
> > see it blocks the driver not returning huge order if huge pfnmap is turned
> > off. VFIO-PCI currently indeed only depends directly on global THP
> > configs, but I don't see why it's strictly needed. So I think it's fine if
> > a driver (even if global THP enabled for pmd/pud) deselect huge pfnmap for
> > other reasons, then here the order returned can still always be PSIZE for
> > the driver. It's really not a huge deal to me.
>
> All these APIs should be around the idea that the driver just returns
> what it has and the core mm places it into ptes. There is not a good
> reason drivers should be overriding this logic or doing their own
> thing.
I'll make sure the driver does not need to consider which mapping sizes
the arch supports.
>
> > > Drivers shouldn't implement this alignment function without also
> > > implementing huge fault, it is pointless. Don't see a reason to add
> > > extra complexity.
> >
> > It's not implementing the order hint without huge fault. It's when both
> > are turned off in a kernel config.. then the order hint (even from driver
> > POV) shouldn't need to be reported.
>
> No, it should still all be the same the core code just won't call the
> function.
>
> > I don't know why you have so strong feeling on having a config check in
> > vfio-pci drivers is bad.
>
> It is leaking MM details into drivers that should not be in drivers.
To me it still makes perfect sense here to pair it with huge_fault(), and
it's driver knowledge alone. It has nothing to do with leaking mm details.
I think I get your point above; maybe while the core mm fallback paths are
not available yet we can mix things together. I'll see what I can do when
I repost.
--
Peter Xu