[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4cdec7ba-373b-455a-b1d3-330246e8b2c8@lucifer.local>
Date: Fri, 7 Nov 2025 09:18:46 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Dev Jain <dev.jain@....com>
Cc: Nico Pache <npache@...hat.com>, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-doc@...r.kernel.org, david@...hat.com, ziy@...dia.com,
baolin.wang@...ux.alibaba.com, Liam.Howlett@...cle.com,
ryan.roberts@....com, corbet@....net, rostedt@...dmis.org,
mhiramat@...nel.org, mathieu.desnoyers@...icios.com,
akpm@...ux-foundation.org, baohua@...nel.org, willy@...radead.org,
peterx@...hat.com, wangkefeng.wang@...wei.com, usamaarif642@...il.com,
sunnanyong@...wei.com, vishal.moola@...il.com,
thomas.hellstrom@...ux.intel.com, yang@...amperecomputing.com,
kas@...nel.org, aarcange@...hat.com, raquini@...hat.com,
anshuman.khandual@....com, catalin.marinas@....com, tiwai@...e.de,
will@...nel.org, dave.hansen@...ux.intel.com, jack@...e.cz,
cl@...two.org, jglisse@...gle.com, surenb@...gle.com,
zokeefe@...gle.com, hannes@...xchg.org, rientjes@...gle.com,
mhocko@...e.com, rdunlap@...radead.org, hughd@...gle.com,
richard.weiyang@...il.com, lance.yang@...ux.dev, vbabka@...e.cz,
rppt@...nel.org, jannh@...gle.com, pfalcato@...e.de
Subject: Re: [PATCH v12 mm-new 07/15] khugepaged: generalize
collapse_huge_page for mTHP collapse
On Fri, Nov 07, 2025 at 08:39:03AM +0530, Dev Jain wrote:
> > ----------[snip]------------
PLease when you snip can you not snip way the code being referenced?
That's really unhelpful and now this sub-thread loses a ton of context...
> >
> > > +
> > > + spin_lock(pmd_ptl);
> > We're duplicating this in both branches, why not do outside if/else?
> >
> > > + WARN_ON_ONCE(!pmd_none(*pmd));
> > Hmm so the PMD entry will still always be empty on mTHP collapse? Surely we
> > could be collapsing more than one mTHP into an existing PTE table no? I may
> > be missing something here/confused :)
>
> After this code path isolates the PTE table, we don't want any other code path
> doing "Hey, I see an empty PMD, let's install a PTE table here". One of the
> reasons why all the heavy locking is required here.
That wasn't the question, the question was why are not able to install mTHP
entries in an existing PTE table.
I'm obviously aware that we need to lock here.
>
> Also, I want to ask a question about WARN vs BUG_ON: suppose that the
> race I described above occurs. After khugepaged isolates the PTE table, someone
> faults in a PTE table there, and eventually writes data in the underlying folios.
> Then the buggy khugepaged nukes out that table and installs a new one, installing
> an mTHP folio which had old data. How do we decide whether such a condition is
> worthy of a BUG_ON (leading to system crash) vs letting this pass with WARN?
To all intents and purposes just use a WARN_ON(). A BUG_ON() is almost
never right. This has been done to death.
Probably the WARN_ON() should be a VM_WARN_ON_ONCE() because this is
something that should simply not be happening in practice.
Or can make if (WARN_ON_ONCE(...)) abort, but then we complicate already
very complciated code.
>
>
> >
> > ------------[snip]----------
> >
Thanks, Lorenzo
Powered by blists - more mailing lists