Message-ID: <20260117005114.GC1134360@nvidia.com>
Date: Fri, 16 Jan 2026 20:51:14 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Matthew Brost <matthew.brost@...el.com>
Cc: Vlastimil Babka <vbabka@...e.cz>,
Francois Dugast <francois.dugast@...el.com>,
intel-xe@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
Zi Yan <ziy@...dia.com>, Alistair Popple <apopple@...dia.com>,
Madhavan Srinivasan <maddy@...ux.ibm.com>,
Nicholas Piggin <npiggin@...il.com>,
Michael Ellerman <mpe@...erman.id.au>,
"Christophe Leroy (CS GROUP)" <chleroy@...nel.org>,
Felix Kuehling <Felix.Kuehling@....com>,
Alex Deucher <alexander.deucher@....com>,
Christian König <christian.koenig@....com>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>,
Lyude Paul <lyude@...hat.com>, Danilo Krummrich <dakr@...nel.org>,
David Hildenbrand <david@...nel.org>,
Oscar Salvador <osalvador@...e.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Leon Romanovsky <leon@...nel.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Michal Hocko <mhocko@...e.com>, Balbir Singh <balbirs@...dia.com>,
linuxppc-dev@...ts.ozlabs.org, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, amd-gfx@...ts.freedesktop.org,
nouveau@...ts.freedesktop.org, linux-mm@...ck.org,
linux-cxl@...r.kernel.org
Subject: Re: [PATCH v6 1/5] mm/zone_device: Reinitialize large zone device
private folios

On Fri, Jan 16, 2026 at 12:31:25PM -0800, Matthew Brost wrote:
> > I suppose we could be getting say an order-9 folio that was previously used
> > as two order-8 folios? And each of them had their _nr_pages in their head
>
> Yes, this is a good example. At this point we have no idea what previous
> allocation(s) order(s) were - we could have multiple places in the loop
> where _nr_pages is populated, thus we have to clear this everywhere.

Why? The fact you have to use such a crazy expression to even access
_nr_pages strongly says nothing will read it as _nr_pages.

Explain each thing:

	new_page->flags.f &= ~0xffUL; /* Clear possible order, page head */

OK, the tail page flags need to be set right, and prep_compound_page()
called later depends on them being zero.

	((struct folio *)(new_page - 1))->_nr_pages = 0;

Can't see a reason, nothing reads _nr_pages from a random tail
page. _nr_pages is the last 8 bytes of struct page so it overlaps
memcg_data, which is also not supposed to be read from a tail page?

	new_folio->mapping = NULL;

Pointless, prep_compound_page() -> prep_compound_tail() -> p->mapping = TAIL_MAPPING;

	new_folio->pgmap = pgmap; /* Also clear compound head */

Pointless, compound_head is set in prep_compound_tail(): set_compound_head(p, head);

	new_folio->share = 0; /* fsdax only, unused for device private */

Not sure, certainly share isn't read from a tail page..
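
To make the overlap claim above concrete, here is a minimal userspace sketch. It is only an illustration under stated assumptions: toy_page and toy_folio are hypothetical four-word layouts, not the kernel's real struct page and struct folio, and the helper names are made up. It shows that a field sitting in the last word of a folio's second page slot lands on the last word of the following page descriptor, which is the word the `((struct folio *)(new_page - 1))->_nr_pages` expression ends up writing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 4-word page descriptor; NOT the kernel's struct page. */
struct toy_page {
	unsigned long flags;
	unsigned long compound_head;
	unsigned long mapping;
	unsigned long memcg_data;	/* last word of the descriptor */
};

/*
 * Hypothetical folio view: slot 0 is the head page, the remaining
 * words overlay the page descriptor that follows it in the memmap.
 */
struct toy_folio {
	struct toy_page page;
	unsigned long _flags_1;
	unsigned long _head_1;
	unsigned long _pad_1;
	unsigned long _nr_pages;	/* last word of slot 1 */
};

/* Offset of _nr_pages measured from the start of slot 1. */
size_t toy_nr_pages_offset_in_slot1(void)
{
	return offsetof(struct toy_folio, _nr_pages) - sizeof(struct toy_page);
}

/*
 * Model of "((struct folio *)(new_page - 1))->_nr_pages = 0" applied to
 * pages[i]: treat pages[i - 1] as a folio head and zero its _nr_pages
 * word, which physically lives inside pages[i].
 */
void toy_clear_nr_pages_via_prev(struct toy_page *pages, size_t i)
{
	unsigned long zero = 0;
	unsigned char *base;

	assert(i >= 1);	/* pages[0] has no previous descriptor */
	base = (unsigned char *)&pages[i - 1];
	memcpy(base + offsetof(struct toy_folio, _nr_pages), &zero, sizeof(zero));
}
```

In this toy layout, calling toy_clear_nr_pages_via_prev(pages, i) zeroes pages[i].memcg_data and nothing else. Whether the real layout lines up the same way depends on the kernel version and config.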

> > > Why can't this use the normal helpers, like memmap_init_compound()?
> > >
> > > struct folio *new_folio = page
> > >
> > > /* First 4 tail pages are part of struct folio */
> > > for (i = 4; i < (1UL << order); i++) {
> > >     prep_compound_tail(..)
> > > }
> > >
> > > prep_compound_head(page, order)
> > > new_folio->_nr_pages = 0
> > >
> > > ??
>
> I've beat this to death with Alistair, normal helpers do not work here.

What do you mean? It already calls prep_compound_page()! The issue
seems to be that prep_compound_page() makes assumptions about what
values are in flags already?

So how about moving that page flags mask logic into
prep_compound_tail()? I think that would help Vlastimil's
concern. That function is already touching most of the cache line so
an extra word shouldn't make a performance difference.
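
As a rough illustration of that suggestion, here is a userspace model, not kernel code. The model_page layout, the MODEL_* constants and the model_* function names are all hypothetical. Only the 0xff mask comes from the patch under discussion, and the "head pointer plus one" encoding mirrors what set_compound_head() does. Head-page setup is deliberately left out. The point is that the flags mask becomes one more store next to the mapping and compound_head stores the tail loop already performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_FLAGS_MASK	0xffUL	/* mask used by the patch under discussion */
#define MODEL_TAIL_MAPPING	0x400UL	/* hypothetical stand-in for TAIL_MAPPING */

/* Hypothetical, heavily reduced page descriptor. */
struct model_page {
	unsigned long flags;
	uintptr_t compound_head;
	unsigned long mapping;
};

/* Tail setup with the stale-flags mask folded in. */
void model_prep_compound_tail(struct model_page *head, size_t tail_idx)
{
	struct model_page *p;

	assert(tail_idx >= 1);	/* index 0 is the head, not a tail */
	p = head + tail_idx;

	p->flags &= ~MODEL_FLAGS_MASK;			/* the proposed extra store */
	p->mapping = MODEL_TAIL_MAPPING;
	p->compound_head = (uintptr_t)head + 1;		/* bit 0 marks a tail page */
}

/* Walk every tail of a 2^order block; head setup is omitted in this model. */
void model_prep_compound_page(struct model_page *head, unsigned int order)
{
	size_t i, nr = (size_t)1 << order;

	for (i = 1; i < nr; i++)
		model_prep_compound_tail(head, i);
}
```

With this shape the caller no longer needs its own per-page flags loop before preparing the compound page, because stale low flag bits from an earlier allocation are cleared as each tail is visited.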

> An order zero allocation could have _nr_pages set in its page,
> new_folio->_nr_pages is page + 1 memory.

An order zero allocation does not have _nr_pages because it is in page
+1 memory that doesn't exist.

An order zero allocation might have memcg_data in the same slot, does
it need zeroing? If so why not add that to prep_compound_head()?

Also, prep_compound_head() handles order 0 too:

	if (IS_ENABLED(CONFIG_64BIT) || order > 1) {
		atomic_set(&folio->_pincount, 0);
		atomic_set(&folio->_entire_mapcount, -1);
	}
	if (order > 1)
		INIT_LIST_HEAD(&folio->_deferred_list);

So some of the problem here looks to be not calling it:

	if (order)
		prep_compound_page(page, order);

So, remove that if? Also shouldn't it be moved above the
set_page_count/lock_page?

Jason