Message-ID: <CAJuCfpEULVxMixDjrk_xg7+3+97dkcMmkDd++BaR17X4tDSs6Q@mail.gmail.com>
Date: Tue, 16 Sep 2025 22:26:10 +0000
From: Suren Baghdasaryan <surenb@...gle.com>
To: Usama Arif <usamaarif642@...il.com>
Cc: Vlastimil Babka <vbabka@...e.cz>, akpm@...ux-foundation.org, kent.overstreet@...ux.dev, 
	hannes@...xchg.org, rientjes@...gle.com, roman.gushchin@...ux.dev, 
	harry.yoo@...cle.com, shakeel.butt@...ux.dev, 00107082@....com, 
	pyyjason@...il.com, pasha.tatashin@...een.com, souravpanda@...gle.com, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in
 /proc/allocinfo output

On Tue, Sep 16, 2025 at 9:52 PM Usama Arif <usamaarif642@...il.com> wrote:
>
>
>
> On 16/09/2025 22:46, Suren Baghdasaryan wrote:
> > On Tue, Sep 16, 2025 at 2:11 PM Usama Arif <usamaarif642@...il.com> wrote:
> >>
> >>
> >>
> >> On 16/09/2025 16:51, Suren Baghdasaryan wrote:
> >>> On Tue, Sep 16, 2025 at 5:57 AM Vlastimil Babka <vbabka@...e.cz> wrote:
> >>>>
> >>>> On 9/16/25 01:02, Suren Baghdasaryan wrote:
> >>>>> While rare, memory allocation profiling can contain inaccurate counters
> >>>>> if slab object extension vector allocation fails. That allocation might
> >>>>> succeed later but prior to that, slab allocations that would have used
> >>>>> that object extension vector will not be accounted for. To indicate
> >>>>> incorrect counters, "accurate:no" marker is appended to the call site
> >>>>> line in the /proc/allocinfo output.
> >>>>> Bump up /proc/allocinfo version to reflect the change in the file format
> >>>>> and update documentation.
> >>>>>
> >>>>> Example output with invalid counters:
> >>>>> allocinfo - version: 2.0
> >>>>>            0        0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
> >>>>>            0        0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
> >>>>>            0        0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
> >>>>>            0        0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
> >>>>>            0        0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
> >>>>>            0        0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
> >>>>>            0        0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
> >>>>>        49152       48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
> >>>>>        32768        1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
> >>>>>            0        0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
> >>>>>
> >>>>> Suggested-by: Johannes Weiner <hannes@...xchg.org>
> >>>>> Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> >>>>> Acked-by: Shakeel Butt <shakeel.butt@...ux.dev>
> >>>>> Acked-by: Usama Arif <usamaarif642@...il.com>
> >>>>> Acked-by: Johannes Weiner <hannes@...xchg.org>
> >>>>
> >>>> With this format you could instead print the accumulated size of allocations
> >>>> that could not allocate their objext (for the given tag). It should be then
> >>>> an upper bound of the actual error, because obviously we cannot recognize
> >>>> moments where these allocations are freed - so we don't know for which tag
> >>>> to decrement. Maybe it could be more useful output than the yes/no
> >>>> information, although of course require more storage in struct codetag, so I
> >>>> don't know if it's worth it.
> >>>
> >>> Yeah, I'm reluctant to add more fields to the codetag and increase the
> >>> overhead until we have a use case. If that happens, then with the new
> >>> format we can add something like error_size:<value> to indicate the
> >>> amount of the error.
> >>>
> >>>>
> >>>> Maybe a global counter of sum size for all these missed objexts could be
> >>>> also maintained, and that wouldn't be an upper bound but an actual current
> >>>> error, that is if we can precisely determine that when freeing an object, we
> >>>> don't have a tag to decrement because objext allocation had failed on it and
> >>>> thus that allocation had incremented this global error counter and it's
> >>>> correct to decrement it.
> >>>
> >>> That's a good idea and should be doable without too much overhead. Thanks!
> >>> For the UAPI... I think for this case IOCTL would work and the use
> >>> scenario would be that the user sees the "accurate:no" mark and issues
> >>> ioctl command to retrieve this global counter value.
> >>> Usama, since you initiated this feature request, do you think such a
> >>> counter would be useful?
> >>>
> >>
> >>
> >> Hmm, I really don't like suggesting changes to /proc/allocinfo, as that will
> >> break parsers, but it might be better to put it there?
> >> If the value is in the file, I imagine people will be more likely to look at it?
> >> I am not sure that everyone will issue an ioctl to find this out,
> >> especially if you have infra that just automatically collects info from
> >> this file.
> >
> > The current file reports per-codetag data and not global counters. We
> > could report it somewhere in the header, but the first question to
> > answer is: would this be really useful (not in the sense of "nice to
> > have" but for a concrete use case)? If not, then I would suggest keeping
> > things simple until there is a need for it.
> >
>
> I think it's a nice-to-have. I can't think of a concrete use case at present.
>
> I guess a potential use case is if you are trying to use memory allocation
> profiling to debug OOMs and the missed objects' size is very large. I guess we
> won't know until this happens, but I would hope this number is usually small.

Hmm. Missing a large allocation and not knowing about it can be a problem...
I'll start sketching a patch to see if tracking such a global counter
has any drawbacks and in the meantime I'm open to suggestions on how
to expose it to userspace.
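
For reference, here is a rough sketch of how an automated collector could
pick up the "accurate:no" marker from the version 2.0 format quoted above.
This is only an illustration, not an existing tool: the names
parse_allocinfo, AllocInfoEntry and SAMPLE are made up for the example, and
it assumes each data line is "<size> <calls> <file:line> key:value ..." with
a missing "accurate" key meaning the counters are trusted.

```python
import re
from typing import List, NamedTuple


class AllocInfoEntry(NamedTuple):
    """One call-site line from allocinfo (hypothetical helper type)."""
    size: int
    calls: int
    location: str
    func: str
    accurate: bool


# Data lines start with two integers (bytes, calls) followed by the call site.
# Header lines such as "allocinfo - version: 2.0" do not match and are skipped.
_LINE_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S+)\s*(.*)$")

# Lines copied from the example output in the patch description.
SAMPLE = """\
allocinfo - version: 2.0
           0        0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
       49152       48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
       32768        1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
"""


def parse_allocinfo(text: str) -> List[AllocInfoEntry]:
    """Parse allocinfo-style text into entries, keeping the accuracy marker."""
    entries = []
    for line in text.splitlines():
        match = _LINE_RE.match(line)
        if match is None:
            continue  # header or unrecognized line
        size, calls, location, rest = match.groups()
        # Remaining tokens are treated as key:value pairs; others are ignored.
        fields = dict(tok.split(":", 1) for tok in rest.split() if ":" in tok)
        entries.append(
            AllocInfoEntry(
                size=int(size),
                calls=int(calls),
                location=location,
                func=fields.get("func", ""),
                # Assumption: the marker is only present when counters are off.
                accurate=fields.get("accurate", "yes") != "no",
            )
        )
    return entries
```

A collector could feed the contents of /proc/allocinfo to parse_allocinfo()
and flag any entry with accurate set to False before trusting its size and
call counts. Unknown key:value pairs are ignored, so an extra field such as
the error_size:<value> idea mentioned earlier would not break this kind of
parser.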

Regarding the concerns about the IOCTL interface: would it be more usable
if we got alloctop [1], or a similar tool that makes it easy to issue
such commands, into kernel/tools?

[1] https://android-review.git.corp.google.com/c/platform/system/memory/libmeminfo/+/3431860

>
