Message-ID: <vmc7bu6muygheuepfltjvbbio6gvjemxostq4rjum66s4ok2f7@x7l3y7ot7mf4>
Date: Mon, 16 Jun 2025 12:49:27 +0200
From: "Pankaj Raghav (Samsung)" <kernel@...kajraghav.com>
To: David Hildenbrand <david@...hat.com>
Cc: Dave Hansen <dave.hansen@...el.com>,
Pankaj Raghav <p.raghav@...sung.com>, Suren Baghdasaryan <surenb@...gle.com>,
Ryan Roberts <ryan.roberts@....com>, Mike Rapoport <rppt@...nel.org>, Michal Hocko <mhocko@...e.com>,
Thomas Gleixner <tglx@...utronix.de>, Nico Pache <npache@...hat.com>, Dev Jain <dev.jain@....com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>, Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...hat.com>,
"H . Peter Anvin" <hpa@...or.com>, Vlastimil Babka <vbabka@...e.cz>, Zi Yan <ziy@...dia.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>, "Liam R . Howlett" <Liam.Howlett@...cle.com>,
Jens Axboe <axboe@...nel.dk>, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
willy@...radead.org, x86@...nel.org, linux-block@...r.kernel.org,
linux-fsdevel@...r.kernel.org, "Darrick J . Wong" <djwong@...nel.org>, mcgrof@...nel.org,
gost.dev@...sung.com, hch@....de
Subject: Re: [PATCH 0/5] add STATIC_PMD_ZERO_PAGE config option
> > >
> > > The mm is a nice convenient place to stick an mm but there are other
> > > ways to keep an efficient refcount around. For instance, you could just
> > > bump a per-cpu refcount and then have the shrinker sum up all the
> > > refcounts to see if there are any outstanding on the system as a whole.
> > >
> > > I understand that the current refcounts are tied to an mm, but you could
> > > either replace the mm-specific ones or add something in parallel for
> > > when there's no mm.
> >
> > But the whole idea of allocating a static PMD page for sane
> > architectures like x86 started with the intent of avoiding the refcounts and
> > shrinker.
> >
> > This was the initial feedback I got[2]:
> >
> > I mean, the whole thing about dynamically allocating/freeing it was for
> > memory-constrained systems. For large systems, we just don't care.
>
> For non-mm usage we can just use the folio refcount. The per-mm refcounts
> are all combined into a single folio refcount. The way the global variable
> is managed based on per-mm refcounts is the weird thing.
>
> In some corner cases we might end up having multiple instances of huge zero
> folios right now. Just imagine:
>
> 1) Allocate huge zero folio during read fault
> 2) vmsplice() it
> 3) Unmap the huge zero folio
> 4) Shrinker runs and frees it
> 5) Repeat with 1)
>
> As long as the folio is vmspliced(), it will not get actually freed ...
>
> I would hope that we could remove the shrinker completely, and simply never
> free the huge zero folio once allocated. Or at least, only free it once it
> is actually no longer used.
>
Thanks for the explanation, David.
But I am still a bit confused about how to proceed with these patches.
So IIUC, our eventual goal is to get rid of the shrinker.
But do we still want to add a static PMD page in the .bss or do we take
an alternate approach here?
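
For reference, the per-cpu refcount idea quoted at the top of this mail
could be modeled in userspace roughly as below. This is only a sketch to
make the discussion concrete, not kernel code: NR_SLOTS, zero_get(),
zero_put(), zero_refs_sum() and shrinker_may_free() are hypothetical names
invented for illustration, a plain array stands in for real per-cpu data,
and there is no locking or memory ordering, which a real implementation
would need.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SLOTS 4	/* stand-in for the number of CPUs (assumption) */

/* One counter per "CPU"; in the kernel this would be real per-cpu data. */
static long zero_refs[NR_SLOTS];

/* Take or drop a reference on the slot of the CPU we happen to run on. */
static void zero_get(int cpu) { zero_refs[cpu]++; }
static void zero_put(int cpu) { zero_refs[cpu]--; }

/*
 * A put may land on a different CPU than the matching get, so a single
 * slot can go negative. Only the sum over all slots is meaningful.
 */
static long zero_refs_sum(void)
{
	long sum = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		sum += zero_refs[i];
	return sum;
}

/* The shrinker would only free the huge zero folio when nothing is left. */
static bool shrinker_may_free(void)
{
	return zero_refs_sum() == 0;
}

/* Small walk-through: two gets, then puts on unrelated slots. */
static void zero_refs_demo(void)
{
	zero_get(0);
	zero_get(1);
	zero_put(3);			/* slot 3 is now -1, sum is 1 */
	assert(!shrinker_may_free());
	zero_put(2);			/* sum drops to 0 */
	assert(shrinker_may_free());
}
```

The get/put fast path touches only one slot, and the cost of walking all
slots is paid by the shrinker, which runs rarely. Whether this is worth
having at all depends on the answer to the question above.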
--
Pankaj