[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKPOu+97BQs6Y_nJ4ONsciSb1=OC_jZ7s9W4ftN-Z378Q9cZ0w@mail.gmail.com>
Date: Wed, 27 Aug 2025 17:38:59 +0200
From: Max Kellermann <max.kellermann@...os.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: akpm@...ux-foundation.org, david@...hat.com, ziy@...dia.com,
baolin.wang@...ux.alibaba.com, Liam.Howlett@...cle.com, npache@...hat.com,
ryan.roberts@....com, dev.jain@....com, baohua@...nel.org,
shikemeng@...weicloud.com, kasong@...cent.com, nphamcs@...il.com,
bhe@...hat.com, chrisl@...nel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL)
On Wed, Aug 27, 2025 at 5:21 PM Lorenzo Stoakes
<lorenzo.stoakes@...cle.com> wrote:
> > But if somebody really passes NULL, the function should not return
> > true - this isn't the huge zero folio after all! However, if the
> > `huge_zero_folio` hasn't been allocated yet, it's NULL, and
> > is_huge_zero_folio(NULL) just happens to return true, which is a lie.
>
> Hmm seems like this is a bug under a bug. folio_put_refs() shouldn't be
> passed a folio batch of NULL's.
Agree! That was exactly my point - I was hunting down a bug that
sometimes caused folio_put_refs() to crash, but most of the time not
(when no zero huge page was allocated yet). And this randomness is
what I'd like to get rid of.
> Shouldn't we just put the VM_WARN_ON_ONCE() there?
Agree, but that was the 2/2 patch I dropped after David's objection.
> But I really don't think passing NULL to is_huge_zero_folio() is a valid
> enough situation to justify this?
>
> You've encountered a case where a bug caused folio_put_refs() to be called
> with an invalid parameter, then you're arbitrarily changing
> is_huge_zero_folio() so it would deref the folio and splat.
Actually, my v1 patch did not do that. Instead, it checked whether the
huge zero page was already allocated, in order to make
is_huge_zero_folio(NULL) to reliably return false, because NULL is not
the huge zero page. Then David disagreed and asked me to add
VM_WARN_ON_ONCE() instead.
> I really think the VM_WARN_ON_ONCE() should be in folios_put_refs() based
> on what you've said.
You only disagree with David, but not with me. I'm happy with either
way of dealing with this kind of bug/abuse.
> > +#include <linux/mmdebug.h> // for VM_WARN_ON_ONCE()
>
> Please don't do //.
In Linux-main, there are currently 432 comments documenting #include
lines. This is a pretty common coding style.
> This include is suspect though, huge_mm.h is included from mm.h and thus
> this very easily might break some arch that is weird about this stuff,
> because a ton of stuff includes mm.h including things that might absolutely
> baulk at mmdebug.
What would you suggest doing instead, to make the VM_WARN_ON_ONCE()
macro available?
> I've had this kind of thing happen several times before.
I know, #includes in Linux are a big mess. A while ago, I tried to
help clean it up, but my effort was rejected by the kernel
maintainers. Which is a pity.
Powered by blists - more mailing lists