Message-ID: <20250618174641.GB1629589@nvidia.com>
Date: Wed, 18 Jun 2025 14:46:41 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Peter Xu <peterx@...hat.com>
Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
kvm@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
Alex Williamson <alex.williamson@...hat.com>,
Zi Yan <ziy@...dia.com>, Alex Mastro <amastro@...com>,
David Hildenbrand <david@...hat.com>,
Nico Pache <npache@...hat.com>
Subject: Re: [PATCH 5/5] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED
mappings
On Wed, Jun 18, 2025 at 12:56:01PM -0400, Peter Xu wrote:
> So I changed my mind, slightly. I can still have the "order" parameter to
> make the API cleaner (even if it'll be a pure overhead.. because all
> existing caller will pass in PUD_SIZE as of now),
That doesn't seem right, the callers should report the real value, not
artificially cap it. For example, ARM does have page sizes greater than PUD
that might be interesting to enable someday for PFN users.
> but I think I'll still
> stick with the ifdef in patch 4, as I mentioned here:
> https://lore.kernel.org/all/aFGMG3763eSv9l8b@x1.local/
>
> The problem is I just noticed yet again that exporting
> huge_mapping_get_va_aligned() for all configs doesn't make sense. At least
> it'll need something like this to make !MMU compile for VFIO, while this is
> definitely some ugliness I also want to avoid..
IMHO this ugliness should certainly be contained to the mm code and not
leak into drivers.
> There's just no way to provide a sane default value for !MMU.
So all this mess seems to say that get_unmapped_area() is just the
wrong fop to have here. It can't be implemented sanely for !MMU and
has these weird conditions, like that it can't fail.
I again suggest to just simplify and add a new fop:
size_t get_best_mapping_order(struct file *filp, pgoff_t pgoff,
size_t length);
Which will return the largest pgoff aligned order within pgoff/length
that the FD could try to install. Very simple for the driver
side. vfio pci will just return ilog2(bar_size).
PAGE_SHIFT can be a safe default.
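To make the idea concrete, here is a rough userspace sketch of what a driver-side implementation could compute. It is only one possible reading of "largest pgoff aligned order within pgoff/length": the largest order for which a naturally aligned block of that size fits inside the requested range, capped by ilog2() of the region (e.g. BAR) size. The names sketch_ilog2(), sketch_best_mapping_order(), the region_size parameter (standing in for the struct file lookup) and the 4 KiB SKETCH_PAGE_SHIFT are all assumptions made for illustration, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u /* assumption: 4 KiB base pages */

/* floor(log2(v)) for v > 0; userspace stand-in for the kernel's ilog2(). */
static size_t sketch_ilog2(uint64_t v)
{
	size_t order = 0;

	assert(v != 0);
	while (v >>= 1)
		order++;
	return order;
}

/*
 * Hypothetical body of the proposed fop. pgoff is in pages, length and
 * region_size are in bytes; the result is a log2 byte order.
 * Overflow of start + length is not handled in this sketch.
 */
static size_t sketch_best_mapping_order(uint64_t pgoff, uint64_t length,
					uint64_t region_size)
{
	uint64_t start = pgoff << SKETCH_PAGE_SHIFT;
	uint64_t end = start + length;
	size_t order;

	/* Anything smaller than a page falls back to the safe default. */
	if (length < (1ull << SKETCH_PAGE_SHIFT) ||
	    region_size < (1ull << SKETCH_PAGE_SHIFT))
		return SKETCH_PAGE_SHIFT;

	/* Start from ilog2(region size), but never exceed the request. */
	order = sketch_ilog2(region_size);
	if (sketch_ilog2(length) < order)
		order = sketch_ilog2(length);

	/* Lower the order until an aligned block fits in [start, end). */
	for (; order > SKETCH_PAGE_SHIFT; order--) {
		uint64_t size = 1ull << order;
		uint64_t block = (start + size - 1) & ~(size - 1);

		if (block + size <= end)
			break;
	}
	return order;
}
```

With a 1 GiB region, a request for the whole region at pgoff 0 yields order 30, while a 4 MiB request starting one page in only has room for a 2 MiB aligned block and yields order 21.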
Then put all this maze of conditionals in the mm side replacing the
call to fops->get_unmapped_area() and don't export anything new. The
mm will automatically cap the alignment based on what the architecture
can do and what
!MMU would simply entirely ignore this new stuff.
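As a rough illustration of the mm side, the sketch below caps the order reported by the FD to what the architecture can map and then nudges a candidate address so that it is congruent with the file offset at that size. The shift values assume x86-64 with 4 KiB pages (PMD at 21, PUD at 30), and sketch_cap_order() and sketch_align_va() are hypothetical names invented here, not existing mm helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for x86-64 with 4 KiB base pages. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PMD_SHIFT  21u
#define SKETCH_PUD_SHIFT  30u

/* Cap the FD's order to the largest level the architecture can map. */
static unsigned int sketch_cap_order(size_t fd_order)
{
	static const unsigned int supported[] = {
		SKETCH_PUD_SHIFT, SKETCH_PMD_SHIFT,
	};
	size_t i;

	for (i = 0; i < sizeof(supported) / sizeof(supported[0]); i++)
		if (fd_order >= supported[i])
			return supported[i];
	return SKETCH_PAGE_SHIFT;
}

/*
 * Round addr up to the next address that has the same offset within a
 * (1 << order) block as the file offset does, so huge entries line up.
 */
static uint64_t sketch_align_va(uint64_t addr, uint64_t pgoff,
				unsigned int order)
{
	uint64_t mask = (1ull << order) - 1;
	uint64_t want = (pgoff << SKETCH_PAGE_SHIFT) & mask;
	uint64_t have = addr & mask;

	assert(order >= SKETCH_PAGE_SHIFT && order < 64);
	return addr + ((want - have) & mask);
}
```

Under these assumptions a driver reporting order 28 would be capped to 21, and the driver never needs to see any of it.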
> So going one step back: huge_mapping_get_va_aligned() (or whatever name we
> prefer) doesn't make sense to be exported always, but only when CONFIG_MMU.
> It should follow the same way we treat mm_get_unmapped_area().
We just deleted !SMP, I really wonder if it is time for !MMU to go
away too..
Jason