Message-ID: <9b447a66-7dcb-442b-9d45-f0b14688aa8c@redhat.com>
Date: Tue, 5 Aug 2025 14:07:49 +0200
From: David Hildenbrand <david@...hat.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Alex Williamson <alex.williamson@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"lizhe.67@...edance.com" <lizhe.67@...edance.com>
Subject: Re: [GIT PULL] VFIO updates for v6.17-rc1
On 05.08.25 13:49, Jason Gunthorpe wrote:
> On Tue, Aug 05, 2025 at 09:47:18AM +0200, David Hildenbrand wrote:
>>> There was discussion here[1] where David Hildenbrand and Jason
>>> Gunthorpe suggested this should be in common code and I believe there
>>> was some intent that this would get reused. I took this as
>>> endorsement from mm folks. This can certainly be pulled back into
>>> subsystem code.
>>
>> Yeah, we ended up here after trying to go the folio-way first, but then
>> realizing that code that called GUP shouldn't have to worry about
>> folios, just to detect consecutive pages+PFNs.
>>
>> I think this helper can come in handy even in folio context.
>> I recall pointing Joanne at it in different fuse context.
>
> The scatterlist code should use it also, it is doing the same logic.
>
>> The concern is rather false positives, meaning, you want consecutive
>> PFNs (just like within a folio), but -- because the stars aligned --
>> you get consecutive "struct page" that do not translate to consecutive PFNs.
>
> I wonder if we can address that from the other side and prevent the
> memory code from creating a bogus contiguous struct page in the first
> place so that struct page contiguity directly reflects physical
> contiguity?

Well, if we could make CONFIG_SPARSEMEM_VMEMMAP the only sparsemem
option ... :) But I recall it's not that easy (e.g., 32bit).

I don't see an easy way to guarantee that. E.g., populate_section_memmap()
really just does a kvmalloc_node(), and
__populate_section_memmap()->memmap_alloc() does a memblock_alloc().

So if the stars align, the "struct page" arrays of two memory sections
are contiguous, although the memory sections themselves are not
contiguous. Just imagine memory holes.
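
To make the false positive concrete, here is a toy userspace model. It is
only a sketch and not kernel code: PAGES_PER_SECTION is set to 4, the
struct layouts are simplified, and setup_sections(), model_page_to_pfn(),
contig_by_pointer() and contig_by_pfn() are made-up names for
illustration. It assumes two sections separated by a PFN hole whose
memmaps happen to sit back to back in memory.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; every name here is hypothetical, not kernel API. */
#define PAGES_PER_SECTION 4UL

struct page {
	unsigned long section_nr;	/* which section this entry belongs to */
};

struct mem_section {
	struct page *memmap;		/* per-section "struct page" array */
	unsigned long start_pfn;	/* first PFN covered by the section */
};

/*
 * Both memmaps live back to back in one pool, modelling two independent
 * allocations that happened to land next to each other.
 */
static struct page memmap_pool[2 * PAGES_PER_SECTION];
static struct mem_section sections[2];

static void setup_sections(void)
{
	unsigned long i;

	/* Section 0 covers PFNs 0..3, section 1 covers PFNs 8..11 (hole: 4..7). */
	sections[0] = (struct mem_section){ memmap_pool, 0 };
	sections[1] = (struct mem_section){ memmap_pool + PAGES_PER_SECTION, 8 };

	for (i = 0; i < 2 * PAGES_PER_SECTION; i++)
		memmap_pool[i].section_nr = i / PAGES_PER_SECTION;
}

/* PFN = section start + index of the page within that section's memmap. */
static unsigned long model_page_to_pfn(const struct page *page)
{
	const struct mem_section *ms = &sections[page->section_nr];

	return ms->start_pfn + (unsigned long)(page - ms->memmap);
}

/* Pointer-only check: can report a false positive across sections. */
static bool contig_by_pointer(const struct page *a, const struct page *b)
{
	return b == a + 1;
}

/* Also compare PFNs, so the section boundary is caught. */
static bool contig_by_pfn(const struct page *a, const struct page *b)
{
	return b == a + 1 && model_page_to_pfn(b) == model_page_to_pfn(a) + 1;
}
```

After setup_sections(), the last page of section 0 and the first page of
section 1 are adjacent as pointers, but in this model their PFNs are 3
and 8, so only the pointer-only check treats them as contiguous.
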

Also, I am not sure if that is really a problem worth solving right now.

If a helper on !CONFIG_SPARSEMEM_VMEMMAP is a little slower on some
operations, like nth_page(), I really don't particularly care.

Eventually, !CONFIG_SPARSEMEM_VMEMMAP will go away in some distant
future and we will all be happy.

--
Cheers,
David / dhildenb