[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <65fb1e02-563f-4323-8bc4-a81a6e6ad09e@redhat.com>
Date: Tue, 5 Aug 2025 15:24:27 +0200
From: David Hildenbrand <david@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Alex Williamson <alex.williamson@...hat.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"lizhe.67@...edance.com" <lizhe.67@...edance.com>,
Jason Gunthorpe <jgg@...dia.com>
Subject: Re: [GIT PULL] VFIO updates for v6.17-rc1
On 05.08.25 15:20, David Hildenbrand wrote:
> On 05.08.25 15:00, Linus Torvalds wrote:
>> On Tue, 5 Aug 2025 at 10:47, David Hildenbrand <david@...hat.com> wrote:
>>>
>>> The concern is rather false positives, meaning, you want consecutive
>>> PFNs (just like within a folio), but -- because the stars aligned --
>>> you get consecutive "struct page" that do not translate to consecutive PFNs.
>>
>> So I don't think that can happen with a valid 'struct page', because
>> if the 'struct page's are in different sections, they will have been
>> allocated separately too.
>
> I think you can end up with two memory sections not being consecutive,
> but the struct pages being consecutive.
>
> I assume the easiest way to achieve that is having a large-enough memory
> hole (PCI hole?) that spans more than section, and memblock just giving
> you the next PFN range to use as "struct page".
>
> It's one allocation per memory section, see
>
> sparse_init_nid()->__populate_section_memmap(prn, PAGES_PER_SECTION) ->
> memmap_alloc()->memblock_alloc().
>
> With memory hotplug, there might be other weird ways to achieve it I
> suspect.
>
>>
>> So you can't have two consecutive 'struct page' things without them
>> being consecutive pages.
>>
>> But by all means, if you want to make sure, just compare the page
>> sections. But converting them to a PFN and then converting back is
>> just crazy.
> > > IOW, the logic would literally be something like (this assumes there
>> is always at least *one* page):
>>
>> struct page *page = *pages++;
>> int section = page_to_section(page);
>>
>> for (size_t nr = 1; nr < nr_pages; nr++) {
>> if (*pages++ != ++page)
>> break;
>> if (page_to_section(page) != section)
>> break;
>> }
>> return nr;
>
> I think that would work, and we could limit the section check to the
> problematic case only (sparsemem without VMEMMAP).
>
>>
>> and yes, I think we only define page_to_section() for
>> SECTION_IN_PAGE_FLAGS, but we should fix that and just have a
>>
>> #define page_to_section(pg) 0
>
> Probably yes, have to think about that.
Ah,
#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
#define SECTION_IN_PAGE_FLAGS
#endif
so it will already be optimized out with your proposal, great!
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists