[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAJuCfpHZhMwK8jOz_evvvD8CaNxxaaRQEx0Qv_yPp4ZA_DkXeg@mail.gmail.com>
Date: Tue, 26 Nov 2024 09:10:23 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: David Wang <00107082@....com>
Cc: kent.overstreet@...ux.dev, linux-kernel@...r.kernel.org, yuzhao@...gle.com
Subject: Re: Abnormal values show up in /proc/allocinfo
On Mon, Nov 25, 2024 at 11:17 PM David Wang <00107082@....com> wrote:
>
> Hi,
>
>
> At 2024-11-26 09:14:50, "David Wang" <00107082@....com> wrote:
> >
> >Hi,
> >
> >
> >
> >At 2024-11-26 04:31:39, "Suren Baghdasaryan" <surenb@...gle.com> wrote:
> >>
> >>Hi David,
> >>Could you please check if you have this fix:
> >>
> >>ed265529d39a "mm/codetag: fix arg in pgalloc_tag_copy alloc_tag_sub"
> >>
> >>It was merged after v6.12-rc6 and it fixes an accounting bug inside
> >>pgalloc_tag_copy(), which is used during compaction.
> >>Thanks,
> >>Suren.
> >>
> >>
> >>https://lore.kernel.org/all/20240906042108.1150526-3-yuzhao@google.com/
> >>https://lore.kernel.org/all/20241022232440.334820-1-souravpanda@google.com/
> >
> >
> >No, ed265529d39a is not in 6.12.0; It is now in Linus' tree.
> >I will pull the code and make some tests.
> >
> >Will update later.
>
>
> I build a kernel based on a tree with top commit 9f16d5e6f220,
> no abnormal value observed for compaction_alloc:
>
> $ sudo cat /proc/allocinfo | grep compaction_alloc
> 0 0 mm/compaction.c:1880 func:compaction_alloc 250098
> (The last column is accumulative call counters I patched with my system, meaning compaction_alloc do happen)
>
>
> But, still got underflowed values:
>
> -4096 18446744073709551615 mm/filemap.c:3788 func:do_read_cache_folio 18
> -86241280 18446744073709530561 mm/filemap.c:1951 func:__filemap_get_folio 8691015
>
> Finally a procedure to reproduce it: (thanks for the tip about compact_memory)
> 1. populate file caches, e.g. grep something in kernel source.
> 2. echo 1 >/proc/sys/vm/compact_memory
> 3. echo 3 > /proc/sys/vm/drop_caches
>
> There would be negative values show up.
> A simple python script to check abnormal values:
>
> with open("/proc/allocinfo") as f:
> for l in f:
> try:
> vs = l.split()
> v1, v2 = int(vs[0]), int(vs[1])
> if v1<0 or v2<0 or (v1==0 and v2!=0) or (v1!=0 and v2==0): print l,
> except: pass
>
>
> Most likely, memory release is accounted *twice* for those memory.
> Reading through the code, I feel the most suspected code would be:
> clear_page_tag_ref(&old->page);
> This line of code may not work as expected.
>
> The code in pgalloc_tag_copy involves too many low level plumbing details, I think it
> is simpler to just implement a *swap* logic: just swap the tags.
>
> I made following changes and it works, no abnormal values detected, so far.
> (pgalloc_tag_copy should be renamed to pgalloc_tag_swap)
>
>
> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> index 2414a7ee7ec7..7d6d1015f4b1 100644
> --- a/lib/alloc_tag.c
> +++ b/lib/alloc_tag.c
> @@ -191,24 +192,30 @@ void pgalloc_tag_split(struct folio *folio, int old_order, int new_order)
>
> void pgalloc_tag_copy(struct folio *new, struct folio *old)
> {
> - union pgtag_ref_handle handle;
> - union codetag_ref ref;
> - struct alloc_tag *tag;
> + union pgtag_ref_handle handle_new, handle_old;
> + union codetag_ref ref_new, ref_old;
> + struct alloc_tag *tag_new, *tag_old;
>
> - tag = pgalloc_tag_get(&old->page);
> - if (!tag)
> + tag_old = pgalloc_tag_get(&old->page);
> + if (!tag_old)
> + return;
> + tag_new = pgalloc_tag_get(&new->page);
> + if (!tag_new)
> return;
>
> - if (!get_page_tag_ref(&new->page, &ref, &handle))
> + if (!get_page_tag_ref(&old->page, &ref_old, &handle_old))
> + return;
> + if (!get_page_tag_ref(&new->page, &ref_new, &handle_new))
> return;
>
> - /* Clear the old ref to the original allocation tag. */
> - clear_page_tag_ref(&old->page);
> - /* Decrement the counters of the tag on get_new_folio. */
> - alloc_tag_sub(&ref, folio_size(new));
> - __alloc_tag_ref_set(&ref, tag);
> - update_page_tag_ref(handle, &ref);
> - put_page_tag_ref(handle);
> + /* swap tag */
> + __alloc_tag_ref_set(&ref_new, tag_old);
> + update_page_tag_ref(handle_new, &ref_new);
> + put_page_tag_ref(handle_new);
> +
> + __alloc_tag_ref_set(&ref_old, tag_new);
> + update_page_tag_ref(handle_old, &ref_old);
> + put_page_tag_ref(handle_old);
> }
Hi David,
Thanks for the investigation. I think your suggestion should work fine
and it's simpler than what we do now. It will swap not only counters
but allocation locations as well, however I think we already do that
when we call __alloc_tag_ref_set(). So, instead of clearing the
original tag, decrementing the new tag's counter (to compensate for
its own allocation) and reassigning the old tag to the new counter,
you simply swap the tags. That seems fine to me.
However I think there is still a bug where some get_new_folio()
callback does not increment the new folio's counters and that's why we
get an underflow when calling alloc_tag_sub(). I'll try to reproduce
on my side and see what's going on there.
Thanks,
Suren.
>
> static void shutdown_mem_profiling(bool remove_file)
>
>
>
> FYI
> David
>
>
> >
> >Thanks~
> >David
> >
> >>
> >>> > >Thanks,
> >>> > >Suren.
> >>> > >
> >>> > >>
> >>> > >>
> >>> > >> Thanks
> >>> > >> David
> >>> > >>
> >>> > >>
> >>> > >>
> >>> > >>
Powered by blists - more mailing lists