Message-ID: <3b6d6e50-91ac-435e-adad-a67d4198a5b5@kernel.org>
Date: Thu, 4 Dec 2025 20:45:59 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
Shuah Khan <skhan@...uxfoundation.org>
Cc: akpm@...ux-foundation.org, Alexander Deucher <Alexander.Deucher@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
amd-gfx@...ts.freedesktop.org, dri-devel <dri-devel@...ts.freedesktop.org>,
Guenter Roeck <linux@...ck-us.net>,
Linux Memory Management List <linux-mm@...ck.org>
Subject: Re: Linux 6.18 amdgpu build error
On 12/4/25 20:36, Linus Torvalds wrote:
> On Thu, 4 Dec 2025 at 09:40, Shuah Khan <skhan@...uxfoundation.org> wrote:
>>
>> This commit has impact on all architectures, not a narrow scoped
>> powerpc only thing - it enables HAVE_GIGANTIC_FOLIOS on x86_64
>> and changes the common code that determines MAX_FOLIO_ORDER in
>> include/linux/mm.h
>
> So I suspect your bisection might not have worked out, and there might
> be two different things going on.
>
> In particular, hugepages were broken in 6.18-rc6 due to commit
> adfb6609c680 ("mm/huge_memory: initialise the tags of the huge zero
> folio").
>
> That was then fixed for rc7 (and obviously final 6.18) by commit
> 5bebe8de19264 ("mm/huge_memory: Fix initialization of huge zero
> folio"), but the breakage up until that time was a bit random.
>
> End result: if you ever ended up bisecting into that broken range
> between those two commits, you would get failures on some loads (but
> not reliably), and your bisection would end up pointing to some random
> thing.
>
> But as mentioned, that particular problem would have been fixed in rc7
> and in final 6.18, so any issues you saw with the final build would
> have been due to something else.
>
> Can I ask you to try to re-do the bisection, but with that commit
> 5bebe8de19264 applied by hand - if it wasn't already there - every
> time you build a kernel that has adfb6609c680?
Right, that's what I also proposed in [1].
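
A minimal sketch of that per-step procedure, for illustration only. It assumes it runs inside a mainline git checkout during "git bisect", and that 5bebe8de19264 cherry-picks cleanly onto every intermediate tree (it may not). The helper names (is_ancestor, needs_fix, prepare_tree) are hypothetical and not part of any existing tool.

```python
import subprocess

BROKEN = "adfb6609c680"  # "mm/huge_memory: initialise the tags of the huge zero folio"
FIX = "5bebe8de19264"    # "mm/huge_memory: Fix initialization of huge zero folio"


def is_ancestor(commit, repo="."):
    """Return True if `commit` is contained in the currently checked-out HEAD."""
    result = subprocess.run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", commit, "HEAD"],
        check=False,
    )
    # Exit status 0 means "is an ancestor"; anything else is treated as "no".
    return result.returncode == 0


def needs_fix(has_broken, has_fix):
    """The fix is only needed on trees that have the breakage but not the fix."""
    return has_broken and not has_fix


def prepare_tree(repo=".", contains=is_ancestor):
    """Apply FIX to the working tree (without committing) when it is needed.

    Returns True if the fix was applied, False if the tree was left alone.
    `contains` is injectable so the decision can be exercised without git.
    """
    if needs_fix(contains(BROKEN, repo), contains(FIX, repo)):
        subprocess.run(
            ["git", "-C", repo, "cherry-pick", "--no-commit", FIX],
            check=True,
        )
        return True
    return False
```

Call prepare_tree() before each build. After testing, run "git reset --hard" to drop the hand-applied fix before marking the step with "git bisect good" or "git bisect bad".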
I cannot make sense of how 39231e8d6ba could possibly trigger it given
that it only affects the value of MAX_FOLIO_ORDER --- which is primarily
used for safety checks and snapshot_page(), nothing that could explain
changed application behavior, really.
But while Shuah is retesting, I'll go have yet another look.
[1] https://lore.kernel.org/all/78af7da4-d213-42c6-8ca6-c2bdca81f233@linuxfoundation.org/
--
Cheers
David