[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpHjfKYNyGeALZzwJ1k_AKOm_qcgKkx5zR+X6eyWmsZTLw@mail.gmail.com>
Date: Thu, 21 Mar 2024 10:19:28 -0700
From: Suren Baghdasaryan <surenb@...gle.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: akpm@...ux-foundation.org, kent.overstreet@...ux.dev, mhocko@...e.com,
vbabka@...e.cz, hannes@...xchg.org, roman.gushchin@...ux.dev, mgorman@...e.de,
dave@...olabs.net, liam.howlett@...cle.com,
penguin-kernel@...ove.sakura.ne.jp, corbet@....net, void@...ifault.com,
peterz@...radead.org, juri.lelli@...hat.com, catalin.marinas@....com,
will@...nel.org, arnd@...db.de, tglx@...utronix.de, mingo@...hat.com,
dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com,
david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org,
nathan@...nel.org, dennis@...nel.org, jhubbard@...dia.com, tj@...nel.org,
muchun.song@...ux.dev, rppt@...nel.org, paulmck@...nel.org,
pasha.tatashin@...een.com, yosryahmed@...gle.com, yuzhao@...gle.com,
dhowells@...hat.com, hughd@...gle.com, andreyknvl@...il.com,
keescook@...omium.org, ndesaulniers@...gle.com, vvvvvv@...gle.com,
gregkh@...uxfoundation.org, ebiggers@...gle.com, ytcoode@...il.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org,
bsegall@...gle.com, bristot@...hat.com, vschneid@...hat.com, cl@...ux.com,
penberg@...nel.org, iamjoonsoo.kim@....com, 42.hyeyoo@...il.com,
glider@...gle.com, elver@...gle.com, dvyukov@...gle.com,
songmuchun@...edance.com, jbaron@...mai.com, aliceryhl@...gle.com,
rientjes@...gle.com, minchan@...gle.com, kaleshsingh@...gle.com,
kernel-team@...roid.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
linux-arch@...r.kernel.org, linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-modules@...r.kernel.org, kasan-dev@...glegroups.com,
cgroups@...r.kernel.org
Subject: Re: [PATCH v6 20/37] mm: fix non-compound multi-order memory
accounting in __free_pages
On Thu, Mar 21, 2024 at 10:04 AM Matthew Wilcox <willy@...radead.org> wrote:
>
> On Thu, Mar 21, 2024 at 04:48:53PM +0000, Matthew Wilcox wrote:
> > On Thu, Mar 21, 2024 at 09:36:42AM -0700, Suren Baghdasaryan wrote:
> > > +++ b/mm/page_alloc.c
> > > @@ -4700,12 +4700,15 @@ void __free_pages(struct page *page, unsigned int order)
> > > {
> > > /* get PageHead before we drop reference */
> > > int head = PageHead(page);
> > > + struct alloc_tag *tag = pgalloc_tag_get(page);
> > >
> > > if (put_page_testzero(page))
> > > free_the_page(page, order);
> > > - else if (!head)
> > > + else if (!head) {
> > > + pgalloc_tag_sub_pages(tag, (1 << order) - 1);
> > > while (order-- > 0)
> > > free_the_page(page + (1 << order), order);
> > > + }
> >
> > Why do you need these new functions instead of just:
> >
> > + else if (!head) {
> > + pgalloc_tag_sub(page, (1 << order) - 1);
> > while (order-- > 0)
> > free_the_page(page + (1 << order), order);
> > + }
>
> Actually, I'm not sure this is safe (I don't fully understand codetags,
> so it may be safe). What can happen is that the put_page() can come in
> before the pgalloc_tag_sub(), and then that page can be allocated again.
> Will that cause confusion?
So, there are two reasons I unfortunately can't reuse pgalloc_tag_sub():
1. We need to subtract `bytes` counter from the codetag but not the
`calls` counter, otherwise the final accounting will be incorrect.
This is because we effectively allocated multiple pages with one call
but freeing them with separate calls here. pgalloc_tag_sub_pages()
subtracts bytes but keeps calls counter the same. I mentioned this in
here: https://lore.kernel.org/all/CAJuCfpEgh1OiYNE_uKG-BqW2x97sOL9+AaTX4Jct3=WHzAv+kg@mail.gmail.com/
2. The codetag object itself is stable, it's created at build time.
The exception is when we unload modules and the codetag section gets
freed but during module unloading we check that all module codetags
are not referenced anymore and we prevent unloading this section if
any of them are still referenced (should not normally happen). That
said, the reference to the codetag (in this case from the page_ext)
might change from under us and we have to make sure it's valid. We
ensure that here by getting the codetag itself with pgalloc_tag_get()
*before* calling put_page_testzero(), which ensures its stability.
>
Powered by blists - more mailing lists