[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0a2e8593-47c6-4a17-b7b0-d4cb718b8f88@redhat.com>
Date: Tue, 5 Aug 2025 09:47:18 +0200
From: David Hildenbrand <david@...hat.com>
To: Alex Williamson <alex.williamson@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"lizhe.67@...edance.com" <lizhe.67@...edance.com>,
Jason Gunthorpe <jgg@...dia.com>
Subject: Re: [GIT PULL] VFIO updates for v6.17-rc1
On 05.08.25 02:53, Alex Williamson wrote:
> On Mon, 4 Aug 2025 16:55:09 -0700
> Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>
>> On Mon, 4 Aug 2025 at 15:22, Alex Williamson <alex.williamson@...hat.com> wrote:
>>>
>>> Li Zhe (6):
>>> mm: introduce num_pages_contiguous()
>>
>> WHY?
>>
>> There is exactly *ONE* user, why the heck do we introduce this
>> completely pointless "helper" function, and put it in a core header
>> file like it was worth it?
>
> There was discussion here[1] where David Hildenbrand and Jason
> Gunthorpe suggested this should be in common code and I believe there
> was some intent that this would get reused. I took this as
> endorsement from mm folks. This can certainly be pulled back into
> subsystem code.
Yeah, we ended up here after trying to go the folio-way first, but then
realizing that code that called GUP shouldn't have to worry about
folios, just to detect consecutive pages+PFNs.
I think this helper will can come in handy even in folio context.
I recall pointing Joanne at it in different fuse context.
>
>> And it's not like that code is some kind of work of art that we want
>> to expose everybody to *anyway*. It's written in a particularly stupid
>> way that means that it's *way* more expensive than it needs to be.
>>
>> And then it's made "inline" despite the code generation being
>> horrible, which makes it all entirely pointless.
>>
>> Yes, I'm grumpy. This pull request came in late, I'm already
>> traveling, and then I look at it and it just makes me *angry* at how
>> bad that code is, and how annoying it is.
>
> Sorry, I usually try to get in later during the first week to let the
> dust settle a bit from the bigger subsystems, I guess I'm running a
> little behind this cycle. We'll get it fixed and I'll resend. Thanks,
>
> Alex
>
>> My builds are already slower than usual because they happen on my
>> laptop while traveling, I do *not* need to see this kind of absolutely
>> disgusting code that does stupid things that make the build even
>> slower.
>>
>> So I refuse to pull this kind of crap.
>>
>> If you insist on making my build slower and exposing these kinds of
>> helper functions, they had better be *good* helper functions.
>>
>> Hint: absolutely nobody cares about "the pages crossed a sparsemem
>> border. If your driver cares about the number of contiguous pages, it
>> might as well say "yeah, they are contiguous, but they are in
>> different sparsemem chunks, so we'll break here too".
The concern is rather false positives, meaning, you want consecutive
PFNs (just like within a folio), but -- because the stars aligned --
you get consecutive "struct page" that do not translate to consecutive PFNs.
So that's why the nth page stuff is not optional in the current
implementation as far as I can tell.
And because that nth_page stuff is so tricky and everyone gets it wrong
all the time, I am actually in favor of having such a helper around. not
buried in some subsystem.
>>
>> And at that point all you care about is 'struct page' being
>> contiguous, instead of doing that disgusting 'nth_page'.
I think stopping when we hit the end of a memory section in case of
!CONFIG_SPARSEMEM_VMEMMAP could be done and documented.
It's just that ... code gets more complicated: we end up really only
optimizing for the unloved child sparsemem withoutCONFIG_SPARSEMEM_VMEMMAP:
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0d4ee569aa6b6..f080fa5a68d4a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1773,6 +1773,9 @@ static inline unsigned long page_to_section(const struct page *page)
* the memory model, this can mean that the addresses of the "struct page"s
* are not contiguous.
*
+ * On sparsemem configs without CONFIG_SPARSEMEM_VMEMMAP, we will stop once we
+ * hit a memory section boundary.
+ *
* @pages: an array of page pointers
* @nr_pages: length of the array
*/
@@ -1781,8 +1784,16 @@ static inline unsigned long num_pages_contiguous(struct page **pages,
{
size_t i;
+ if (IS_ENABLED(CONFIG_SPARSEMEM) &&
+ !IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP)) {
+ const unsigned long pfn = page_to_pfn(pages[0]);
+
+ nr_pages = min_t(size_t, nr_pages,
+ PAGES_PER_SECTION - (pfn & PAGES_PER_SECTION));
+ }
+
for (i = 1; i < nr_pages; i++)
- if (pages[i] != nth_page(pages[0], i))
+ if (pages[i] != pages[i - 1] + 1)
break;
Whether that helper should live in mm/utils.c is a valid point the bigger it gets.
>>
>> And then - since there is only *one* single user - you don't put it in
>> the most central header file that EVERYBODY ELSE cares about.
>>
>> And you absolutely don't do it if it generates garbage code for no good reason!
I understand the hate for nth_page in this context.
But I rather hate it *completely* because people get it wrong all of the time.
In any case, enjoy you travel Linus.
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists