Message-ID: <613698f0.a994.19939d88e1c.Coremail.00107082@163.com>
Date: Fri, 12 Sep 2025 01:35:17 +0800 (CST)
From: "David Wang" <00107082@....com>
To: "Yueyang Pan" <pyyjason@...il.com>,
"Suren Baghdasaryan" <surenb@...gle.com>
Cc: "Usama Arif" <usamaarif642@...il.com>, akpm@...ux-foundation.org,
kent.overstreet@...ux.dev, vbabka@...e.cz, hannes@...xchg.org,
rientjes@...gle.com, roman.gushchin@...ux.dev, harry.yoo@...cle.com,
shakeel.butt@...ux.dev, pasha.tatashin@...een.com,
souravpanda@...gle.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] alloc_tag: mark inaccurate allocation counters in
/proc/allocinfo output
At 2025-09-12 01:25:05, "Yueyang Pan" <pyyjason@...il.com> wrote:
>On Thu, Sep 11, 2025 at 09:18:29AM -0700, Suren Baghdasaryan wrote:
>> On Thu, Sep 11, 2025 at 9:00 AM Usama Arif <usamaarif642@...il.com> wrote:
>> >
>> >
>> >
>> > On 11/09/2025 16:47, Yueyang Pan wrote:
>> > > On Thu, Sep 11, 2025 at 11:03:50PM +0800, David Wang wrote:
>> > >>
>> > >> At 2025-09-10 07:49:42, "Suren Baghdasaryan" <surenb@...gle.com> wrote:
>> > >>> While rare, memory allocation profiling can contain inaccurate counters
>> > >>> if slab object extension vector allocation fails. That allocation might
>> > >>> succeed later but prior to that, slab allocations that would have used
>> > >>> that object extension vector will not be accounted for. To indicate
>> > >>> incorrect counters, mark them with an asterisk in the /proc/allocinfo
>> > >>> output.
>> > >>> Bump up /proc/allocinfo version to reflect change in the file format.
>> > >>>
>> > >>> Example output with invalid counters:
>> > >>> allocinfo - version: 2.0
>> > >>> 0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
>> > >>> 0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
>> > >>> 0* 0* arch/x86/kernel/alternative.c:127 func:__its_alloc
>> > >>> 0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
>> > >>> 0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
>> > >>> 0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
>> > >>> 0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
>> > >>> 49152* 48* arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create
>> > >>> 32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
>> > >>> 0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
>> > >>>
>> > >>
>> > >> Hi,
>> > >> The changes may break some client tools, mine included....
>> > >> I don't mind adjusting my tools, but still
>> > >> Is it acceptable to change
>> > >> 49152* 48* arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create
>> > >> to
>> > >> +49152 +48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create*
>> > >>
>> > >> The '+' sign still makes the line stand out when viewed in a terminal, and client tools, though not all of them, might not need any changes.
>> > >> And when a client wants to filter out inaccurate data items, it can do so by checking for the trailing '*' on the func name.
>> > >
>> > > I agree with David on this point. We already have a monitoring tool built on top
>> > > of this output across the Meta fleet. Ideally we would like to keep the format
>> > > of size and calls the same, even in future versions, because adding a * will
>> > > change the type from int to str, which means changing the regex parser in
>> > > many places.
>> > >
>> > > I think simply adding * to the end of function name or filename is sufficient
>> > > as they are already str.
>> > >
>> >
>> > Instead of:
>> >
>> > 49152* 48* arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create
>> >
>> > Could we do something like:
>> >
>> > 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create(inaccurate)
>>
>> If there is any postprocessing, then this would break sometime later
>> when the function name is parsed, right? So IMO that just postpones
>> the breakage.
>>
>> >
>> > This should hopefully not require any changes to the tools that are consuming this file.
>> > I think it might be better to use "(inaccurate)" (without any space after function name) or
>> > some other text instead of "+" or "*" to prevent breaking such tools. I don't think we would
>> > even need to increment the allocinfo version number then?
>>
>> I'm wondering if we add a new column at the end like this:
>>
>> 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create [inaccurate]
>>
>> would that break the parsing tools?
>> Well-designed parsers usually throw away additional fields which they
>> don't know how to parse. WDYT?
>>
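To make the "throw away additional fields" idea concrete, here is a rough Python sketch of a parser that would tolerate a trailing column. It is only an illustration, not any existing tool: the function name parse_tolerant and the returned dict keys are made up, and the "[inaccurate]" marker is just the format proposed above, not something the kernel emits today. It locates the "func:" field instead of counting fields, so the optional module name and any unknown trailing columns do not confuse it.

```python
def parse_tolerant(text):
    """Parse one /proc/allocinfo data line, ignoring unknown trailing columns.

    Assumed layout (hypothetical, following the proposal in this thread):
        <bytes> <calls> <file>:<line> [<module>] func:<name> [extra columns...]
    where "[<module>]" is optional.
    """
    fields = text.split()
    if len(fields) < 4:
        raise ValueError("too few fields: %r" % text)

    size, calls, location = fields[0], fields[1], fields[2]
    rest = fields[3:]

    # Find the function field by its prefix rather than by position.
    func_idx = None
    for i, field in enumerate(rest):
        if field.startswith("func:"):
            func_idx = i
            break
    if func_idx is None:
        raise ValueError("no func: field in %r" % text)

    # Anything between the location and func: is the optional module name.
    module = rest[0].strip("[]") if func_idx == 1 else None
    # Anything after func: is an extra column this parser does not require.
    extras = rest[func_idx + 1:]

    filename, lineno = location.rsplit(":", 1)
    return {
        "bytes": int(size),
        "calls": int(calls),
        "file": filename,
        "line": int(lineno),
        "module": module,
        "func": rest[func_idx].split(":", 1)[1],
        "inaccurate": "[inaccurate]" in extras,
        "extras": extras,
    }
```

With this approach the size and calls columns stay plain integers, and a tool that does not care about accuracy can simply ignore the "extras" list.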
>
>It would break our parser now, as we count the number of strings to decide
>whether there is an optional module name. I don't think it is a big
>deal to fix, though.
The inconsistency of the module name field is really inconvenient for parsing.
Could we change it to be consistent, with something like:
diff --git a/lib/codetag.c b/lib/codetag.c
index 545911cebd25..b8a4595adc95 100644
--- a/lib/codetag.c
+++ b/lib/codetag.c
@@ -124,7 +124,7 @@ void codetag_to_text(struct seq_buf *out, struct codetag *ct)
 			       ct->filename, ct->lineno,
 			       ct->modname, ct->function);
 	else
-		seq_buf_printf(out, "%s:%u func:%s",
+		seq_buf_printf(out, "%s:%u [kernel] func:%s",
 			       ct->filename, ct->lineno, ct->function);
 }
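
To illustrate why this matters on the parsing side, here is a small Python sketch (not my actual tool; the names LINE_RE, parse_current and parse_fixed are made up for this mail). parse_current handles today's format, where the "[module]" field may or may not be present, and needs a regex with an optional group. parse_fixed assumes the diff above is applied, so every line has exactly five whitespace-separated fields and a plain split is enough.

```python
import re

# Today's format: "<bytes> <calls> <file>:<line> [<module>] func:<name>",
# where "[<module>]" only appears for tags that belong to a module.
LINE_RE = re.compile(
    r"^\s*(?P<bytes>\d+)\s+(?P<calls>\d+)\s+"
    r"(?P<file>\S+):(?P<line>\d+)\s+"
    r"(?:\[(?P<module>[^\]]+)\]\s+)?"   # optional module column
    r"func:(?P<func>\S+)\s*$"
)


def parse_current(text):
    """Parse a line in the current format; module is None when absent."""
    m = LINE_RE.match(text)
    if m is None:
        return None
    return (int(m.group("bytes")), int(m.group("calls")),
            m.group("file"), int(m.group("line")),
            m.group("module"), m.group("func"))


def parse_fixed(text):
    """Parse a line assuming the module column is always present
    (e.g. "[kernel]" for built-in code, as the diff would print)."""
    size, calls, location, module, func = text.split()
    filename, lineno = location.rsplit(":", 1)
    return (int(size), int(calls), filename, int(lineno),
            module.strip("[]"), func.split(":", 1)[1])
```

With a fixed number of columns, appending a new column at the end later would also be easier to detect, since the parser would no longer need the field count to guess whether a module name is there.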
>
>I think one more important thing is probably to reach a consensus on
>which parts of the format can change in the future. For example, we can
>keep adding columns but never change the type of an existing column.
>With such a consensus in mind, it will be easier to design the parser.
>And I guess many companies will build parsers on top of this info for
>fleet-wide collection.
>
>> >
>> > >>
>> > >> (There would be some corner cases. For example, the '+' sign may not be needed when the value becomes negative because of some underflow bug.)
>> > >>
>> > >>
>> > >> Thanks
>> > >> David.
>> > >>
>> > >>
>> > >>> Suggested-by: Johannes Weiner <hannes@...xchg.org>
>> > >>> Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
>> > >>> ---
>> > >>
>> > >
>> > > Thanks
>> > > Pan
>> >
>
>Thanks
>Pan