[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpGeep=4CqW+z4K=hXf2A6V3aWZLi_XSeEuEz1v=S7qKnw@mail.gmail.com>
Date: Thu, 21 Mar 2024 10:22:46 -0700
From: Suren Baghdasaryan <surenb@...gle.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: akpm@...ux-foundation.org, kent.overstreet@...ux.dev, mhocko@...e.com,
vbabka@...e.cz, hannes@...xchg.org, roman.gushchin@...ux.dev, mgorman@...e.de,
dave@...olabs.net, liam.howlett@...cle.com,
penguin-kernel@...ove.sakura.ne.jp, corbet@....net, void@...ifault.com,
peterz@...radead.org, juri.lelli@...hat.com, catalin.marinas@....com,
will@...nel.org, arnd@...db.de, tglx@...utronix.de, mingo@...hat.com,
dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com,
david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org,
nathan@...nel.org, dennis@...nel.org, jhubbard@...dia.com, tj@...nel.org,
muchun.song@...ux.dev, rppt@...nel.org, paulmck@...nel.org,
pasha.tatashin@...een.com, yosryahmed@...gle.com, yuzhao@...gle.com,
dhowells@...hat.com, hughd@...gle.com, andreyknvl@...il.com,
keescook@...omium.org, ndesaulniers@...gle.com, vvvvvv@...gle.com,
gregkh@...uxfoundation.org, ebiggers@...gle.com, ytcoode@...il.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org,
bsegall@...gle.com, bristot@...hat.com, vschneid@...hat.com, cl@...ux.com,
penberg@...nel.org, iamjoonsoo.kim@....com, 42.hyeyoo@...il.com,
glider@...gle.com, elver@...gle.com, dvyukov@...gle.com,
songmuchun@...edance.com, jbaron@...mai.com, aliceryhl@...gle.com,
rientjes@...gle.com, minchan@...gle.com, kaleshsingh@...gle.com,
kernel-team@...roid.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
linux-arch@...r.kernel.org, linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-modules@...r.kernel.org, kasan-dev@...glegroups.com,
cgroups@...r.kernel.org
Subject: Re: [PATCH v6 20/37] mm: fix non-compound multi-order memory
accounting in __free_pages
On Thu, Mar 21, 2024 at 10:19 AM Suren Baghdasaryan <surenb@...gle.com> wrote:
>
> On Thu, Mar 21, 2024 at 10:04 AM Matthew Wilcox <willy@...radead.org> wrote:
> >
> > On Thu, Mar 21, 2024 at 04:48:53PM +0000, Matthew Wilcox wrote:
> > > On Thu, Mar 21, 2024 at 09:36:42AM -0700, Suren Baghdasaryan wrote:
> > > > +++ b/mm/page_alloc.c
> > > > @@ -4700,12 +4700,15 @@ void __free_pages(struct page *page, unsigned int order)
> > > > {
> > > > /* get PageHead before we drop reference */
> > > > int head = PageHead(page);
> > > > + struct alloc_tag *tag = pgalloc_tag_get(page);
> > > >
> > > > if (put_page_testzero(page))
> > > > free_the_page(page, order);
> > > > - else if (!head)
> > > > + else if (!head) {
> > > > + pgalloc_tag_sub_pages(tag, (1 << order) - 1);
> > > > while (order-- > 0)
> > > > free_the_page(page + (1 << order), order);
> > > > + }
> > >
> > > Why do you need these new functions instead of just:
> > >
> > > + else if (!head) {
> > > + pgalloc_tag_sub(page, (1 << order) - 1);
> > > while (order-- > 0)
> > > free_the_page(page + (1 << order), order);
> > > + }
> >
> > Actually, I'm not sure this is safe (I don't fully understand codetags,
> > so it may be safe). What can happen is that the put_page() can come in
> > before the pgalloc_tag_sub(), and then that page can be allocated again.
> > Will that cause confusion?
I indirectly answered your question in the reason #2 but to be clear,
we obtain codetag before we do put_page() here, therefore it's valid.
If another page is allocated and it points to the same codetag, then
it will operate on the same codetag per-cpu counters and that should
not be a problem.
>
> So, there are two reasons I unfortunately can't reuse pgalloc_tag_sub():
>
> 1. We need to subtract `bytes` counter from the codetag but not the
> `calls` counter, otherwise the final accounting will be incorrect.
> This is because we effectively allocated multiple pages with one call
> but freeing them with separate calls here. pgalloc_tag_sub_pages()
> subtracts bytes but keeps calls counter the same. I mentioned this in
> here: https://lore.kernel.org/all/CAJuCfpEgh1OiYNE_uKG-BqW2x97sOL9+AaTX4Jct3=WHzAv+kg@mail.gmail.com/
> 2. The codetag object itself is stable, it's created at build time.
> The exception is when we unload modules and the codetag section gets
> freed but during module unloading we check that all module codetags
> are not referenced anymore and we prevent unloading this section if
> any of them are still referenced (should not normally happen). That
> said, the reference to the codetag (in this case from the page_ext)
> might change from under us and we have to make sure it's valid. We
> ensure that here by getting the codetag itself with pgalloc_tag_get()
> *before* calling put_page_testzero(), which ensures its stability.
>
> >
Powered by blists - more mailing lists