Message-ID: <CAJuCfpGezf06eR7WnzizpwTaxZ5Rm8jbeW4y87zcr6LZuJ9MZA@mail.gmail.com>
Date: Thu, 11 Sep 2025 09:18:29 -0700
From: Suren Baghdasaryan <surenb@...gle.com>
To: Usama Arif <usamaarif642@...il.com>
Cc: Yueyang Pan <pyyjason@...il.com>, David Wang <00107082@....com>, akpm@...ux-foundation.org,
kent.overstreet@...ux.dev, vbabka@...e.cz, hannes@...xchg.org,
rientjes@...gle.com, roman.gushchin@...ux.dev, harry.yoo@...cle.com,
shakeel.butt@...ux.dev, pasha.tatashin@...een.com, souravpanda@...gle.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] alloc_tag: mark inaccurate allocation counters in
/proc/allocinfo output
On Thu, Sep 11, 2025 at 9:00 AM Usama Arif <usamaarif642@...il.com> wrote:
>
>
>
> On 11/09/2025 16:47, Yueyang Pan wrote:
> > On Thu, Sep 11, 2025 at 11:03:50PM +0800, David Wang wrote:
> >>
> >> At 2025-09-10 07:49:42, "Suren Baghdasaryan" <surenb@...gle.com> wrote:
> >>> While rare, memory allocation profiling can contain inaccurate counters
> >>> if slab object extension vector allocation fails. That allocation might
> >>> succeed later but prior to that, slab allocations that would have used
> >>> that object extension vector will not be accounted for. To indicate
> >>> incorrect counters, mark them with an asterisk in the /proc/allocinfo
> >>> output.
> >>> Bump up /proc/allocinfo version to reflect change in the file format.
> >>>
> >>> Example output with invalid counters:
> >>> allocinfo - version: 2.0
> >>> 0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
> >>> 0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
> >>> 0* 0* arch/x86/kernel/alternative.c:127 func:__its_alloc
> >>> 0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
> >>> 0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
> >>> 0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
> >>> 0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
> >>> 49152* 48* arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create
> >>> 32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
> >>> 0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
> >>>
> >>
> >> Hi,
> >> The changes may break some client tools, mine included....
> >> I don't mind adjusting my tools, but still
> >> Is it acceptable to change
> >> 49152* 48* arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create
> >> to
> >> +49152 +48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create*
> >>
> >> The '+' sign still makes it stand out when viewed from a terminal, and client tools, though not all of them, might not need any changes.
> >> And when a client wants to filter out inaccurate data items, it could do so by checking for the trailing '*' on the func name.
> >
> > I agree with David on this point. We already have a monitoring tool built on top
> > of this output across the Meta fleet. Ideally we would like to keep the format
> > of size and calls the same, even for future versions, because adding a * will
> > change the format from int to str, which leads to changes to the regex parsers
> > in many places.
> >
> > I think simply adding * to the end of function name or filename is sufficient
> > as they are already str.
> >
>
> Instead of:
>
> 49152* 48* arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create
>
> Could we do something like:
>
> 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create(inaccurate)
If there is any postprocessing, then this would break sometime later,
when the function name is parsed, right? So IMO that just postpones
the breakage.
>
> This should hopefully not require any changes to the tools that are consuming this file.
> I think it might be better to use "(inaccurate)" (without any space after function name) or
> some other text instead of "+" or "*" to prevent breaking such tools. I don't think we
> even need to increment the allocinfo version number then?
I'm wondering if we add a new column at the end like this:
49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create [inaccurate]
would that break the parsing tools?
Well-designed parsers usually throw away additional fields which they
don't know how to parse. WDYT?
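
For illustration, a tolerant parser of the kind described above might look
like the following sketch. It is not taken from any existing tool: the
names AllocRecord and parse_allocinfo_line are hypothetical, and it
assumes each data line has the four whitespace-separated fields shown in
the examples in this thread, with any marker appended as a fifth field.
Header lines are assumed to be skipped by the caller.

```python
from typing import NamedTuple


class AllocRecord(NamedTuple):
    """One parsed data line (hypothetical type, for illustration only)."""
    size: int
    calls: int
    location: str
    func: str
    accurate: bool


def parse_allocinfo_line(line: str) -> AllocRecord:
    """Parse one data line, tolerating unknown trailing fields.

    Assumes the layout "<size> <calls> <file>:<line> func:<name>",
    optionally followed by extra fields such as "[inaccurate]".
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"too few fields: {line!r}")
    size, calls, location, func = fields[:4]
    extras = fields[4:]  # anything the parser does not know about
    prefix = "func:"
    if func.startswith(prefix):
        func = func[len(prefix):]
    return AllocRecord(
        size=int(size),
        calls=int(calls),
        location=location,
        func=func,
        # Only the proposed marker is interpreted; other extras are ignored.
        accurate="[inaccurate]" not in extras,
    )
```

With this approach the size and calls fields stay plain integers, so
int() keeps working on them. By contrast, int("49152*") raises
ValueError in Python, which is the kind of breakage discussed earlier in
the thread.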
>
> >>
> >> (There would be some corner cases; for example, the '+' sign may not be needed when the value becomes negative because of some underflow bug.)
> >>
> >>
> >> Thanks
> >> David.
> >>
> >>
> >>> Suggested-by: Johannes Weiner <hannes@...xchg.org>
> >>> Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> >>> ---
> >>
> >
> > Thanks
> > Pan
>