Message-ID: <CAJuCfpGgVa9X7nXhqOUZWi+p+JGz1QofiXrTJ+BF=DU3m2-64w@mail.gmail.com>
Date: Tue, 16 Sep 2025 22:27:01 +0000
From: Suren Baghdasaryan <surenb@...gle.com>
To: Usama Arif <usamaarif642@...il.com>
Cc: Vlastimil Babka <vbabka@...e.cz>, akpm@...ux-foundation.org, kent.overstreet@...ux.dev,
hannes@...xchg.org, rientjes@...gle.com, roman.gushchin@...ux.dev,
harry.yoo@...cle.com, shakeel.butt@...ux.dev, 00107082@....com,
pyyjason@...il.com, pasha.tatashin@...een.com, souravpanda@...gle.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in
/proc/allocinfo output
On Tue, Sep 16, 2025 at 10:26 PM Suren Baghdasaryan <surenb@...gle.com> wrote:
>
> On Tue, Sep 16, 2025 at 9:52 PM Usama Arif <usamaarif642@...il.com> wrote:
> >
> >
> >
> > On 16/09/2025 22:46, Suren Baghdasaryan wrote:
> > > On Tue, Sep 16, 2025 at 2:11 PM Usama Arif <usamaarif642@...il.com> wrote:
> > >>
> > >>
> > >>
> > >> On 16/09/2025 16:51, Suren Baghdasaryan wrote:
> > >>> On Tue, Sep 16, 2025 at 5:57 AM Vlastimil Babka <vbabka@...e.cz> wrote:
> > >>>>
> > >>>> On 9/16/25 01:02, Suren Baghdasaryan wrote:
> > >>>>> While rare, memory allocation profiling can contain inaccurate counters
> > >>>>> if slab object extension vector allocation fails. That allocation might
> > >>>>> succeed later but prior to that, slab allocations that would have used
> > >>>>> that object extension vector will not be accounted for. To indicate
> > >>>>> incorrect counters, an "accurate:no" marker is appended to the call site
> > >>>>> line in the /proc/allocinfo output.
> > >>>>> Bump up /proc/allocinfo version to reflect the change in the file format
> > >>>>> and update documentation.
> > >>>>>
> > >>>>> Example output with invalid counters:
> > >>>>> allocinfo - version: 2.0
> > >>>>> 0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
> > >>>>> 0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
> > >>>>> 0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
> > >>>>> 0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
> > >>>>> 0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
> > >>>>> 0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
> > >>>>> 0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
> > >>>>> 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
> > >>>>> 32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
> > >>>>> 0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
> > >>>>>
> > >>>>> Suggested-by: Johannes Weiner <hannes@...xchg.org>
> > >>>>> Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> > >>>>> Acked-by: Shakeel Butt <shakeel.butt@...ux.dev>
> > >>>>> Acked-by: Usama Arif <usamaarif642@...il.com>
> > >>>>> Acked-by: Johannes Weiner <hannes@...xchg.org>
> > >>>>
> > >>>> With this format you could instead print the accumulated size of allocations
> > >>>> that could not allocate their objext (for the given tag). It should be then
> > >>>> an upper bound of the actual error, because obviously we cannot recognize
> > >>>> moments where these allocations are freed - so we don't know for which tag
> > >>>> to decrement. Maybe it could be more useful output than the yes/no
> > >>>> information, although of course it would require more storage in struct
> > >>>> codetag, so I don't know if it's worth it.
> > >>>
> > >>> Yeah, I'm reluctant to add more fields to the codetag and increase the
> > >>> overhead until we have a use case. If that happens, with the new
> > >>> format we can add something like error_size:<value> to indicate the
> > >>> amount of the error.
> > >>>
> > >>>>
> > >>>> Maybe a global counter of sum size for all these missed objexts could be
> > >>>> also maintained, and that wouldn't be an upper bound but an actual current
> > >>>> error, that is if we can precisely determine that when freeing an object, we
> > >>>> don't have a tag to decrement because objext allocation had failed on it and
> > >>>> thus that allocation had incremented this global error counter and it's
> > >>>> correct to decrement it.
> > >>>
> > >>> That's a good idea and should be doable without too much overhead. Thanks!
> > >>> For the UAPI... I think for this case an IOCTL would work and the usage
> > >>> scenario would be that the user sees the "accurate:no" mark and issues
> > >>> an ioctl command to retrieve this global counter value.
> > >>> Usama, since you initiated this feature request, do you think such a
> > >>> counter would be useful?
> > >>>
> > >>
> > >>
> > >> hmm, I really don't like suggesting changing /proc/allocinfo as it will break parsers,
> > >> but it might be better to put it there?
> > >> If the value is in the file, I imagine people will be more prone to looking at it?
> > >> I am not completely sure if everyone will do an ioctl to try and find this out?
> > >> Especially if you just have infra that is just automatically collecting info from
> > >> this file.
> > >
> > > The current file reports per-codetag data and not global counters. We
> > > could report it somewhere in the header but the first question to
> > > answer is: would this be really useful (not in a way of "nice to
> > > have" but for a concrete usecase)? If not then I would suggest keeping
> > > things simple until there is a need for it.
> > >
> >
> > I think it's a nice to have. I can't think of a concrete use case at present.
> >
> > I guess a potential use case is if you are trying to use memory allocation
> > profiling to debug OOMs and the missed objects size is very large. I guess we
> > won't know until this happens, but I would hope this number is usually small.
>
> Hmm. Missing a large allocation and not knowing about it can be a problem...
> I'll start sketching a patch to see if tracking such a global counter
> has any drawbacks and in the meantime I'm open to suggestions on how
> to expose it to the userspace.
>
> About concerns on the IOCTL interface, would it be more usable if we
> got alloctop [1], or a similar tool that can easily issue such
> commands, into kernel/tools?
>
> [1] https://android-review.git.corp.google.com/c/platform/system/memory/libmeminfo/+/3431860
Ugh, sorry. The externally accessible link would be
https://android-review.googlesource.com/c/platform/system/memory/libmeminfo/+/3431860
>
> >
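
For infra that collects this file automatically, here is a rough sketch of
how a parser could pick up the new marker. It assumes only the line layout
shown in the example output quoted above (size, calls, file:line, then
key:value fields such as func: and accurate:), and it treats a missing
"accurate" field as accurate. The names AllocSite, parse_allocinfo and
inaccurate_sites are made up for illustration; they are not part of
alloctop or of any kernel tool.

```python
from dataclasses import dataclass


@dataclass
class AllocSite:
    size: int       # first column of the line (bytes)
    calls: int      # second column of the line
    location: str   # "file:line" of the call site
    func: str       # value of the "func:" field, "" if absent
    accurate: bool  # False only when the line carries "accurate:no"


def parse_allocinfo(text):
    """Parse allocinfo-style text into a list of AllocSite records."""
    sites = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the version header, any other header line, and blank lines:
        # data lines are assumed to start with two integer columns.
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Remaining tokens are read as key:value pairs; tokens without a
        # colon are ignored rather than rejected.
        attrs = dict(f.split(":", 1) for f in fields[3:] if ":" in f)
        sites.append(AllocSite(
            size=int(fields[0]),
            calls=int(fields[1]),
            location=fields[2],
            func=attrs.get("func", ""),
            accurate=attrs.get("accurate", "yes") != "no",
        ))
    return sites


def inaccurate_sites(text):
    """Return only the call sites whose counters are marked inaccurate."""
    return [s for s in parse_allocinfo(text) if not s.accurate]


# Three lines taken from the example output in the patch description.
SAMPLE = """allocinfo - version: 2.0
0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
"""

if __name__ == "__main__":
    for site in inaccurate_sites(SAMPLE):
        print(site.size, site.calls, site.location, site.func)
```

On a live system the text would come from reading /proc/allocinfo instead
of SAMPLE. Because unknown fields are ignored, a later addition such as the
error_size:<value> idea discussed above should not break a parser written
this way.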