[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aLCMTu-Ci2yV40zn@fedora>
Date: Thu, 28 Aug 2025 10:05:18 -0700
From: "Vishal Moola (Oracle)" <vishal.moola@...il.com>
To: Yueyang Pan <pyyjason@...il.com>
Cc: Suren Baghdasaryan <surenb@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>, Michal Hocko <mhocko@...e.com>,
Brendan Jackman <jackmanb@...gle.com>,
Johannes Weiner <hannes@...xchg.org>, Zi Yan <ziy@...dia.com>,
Usama Arif <usamaarif642@...il.com>, linux-mm@...ck.org,
kernel-team@...a.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 0/2] mm/show_mem: Bug fix for print mem alloc info
On Thu, Aug 28, 2025 at 01:29:08AM -0700, Yueyang Pan wrote:
> On Wed, Aug 27, 2025 at 12:51:17PM -0700, Vishal Moola (Oracle) wrote:
> > On Wed, Aug 27, 2025 at 11:34:21AM -0700, Yueyang Pan wrote:
> > > This patch set fixes two issues we saw in production rollout.
> > >
> > > The first issue is that we saw all zero output of memory allocation
> > > profiling information from show_mem() if CONFIG_MEM_ALLOC_PROFILING
> > > is set and sysctl.vm.mem_profiling=0. In this case, the behaviour
> > > should be the same as when CONFIG_MEM_ALLOC_PROFILING is unset,
> >
> > Did you mean to say when sysctl.vm.mem_profiling=never?
> >
> > My understanding is that setting the sysctl=0 Pauses memory allocation
> > profiling, while 1 Resumes it. When the sysctl=never should be the same
> > as when the config is unset, but I suspect we might still want the info
> > when set to 0.
>
> Thanks for your feedback Vishal. Here I mean for both =0 and =never.
> In both cases, now __show_mem() will print all 0s, which both is redundant
> and also makes differentiate hard. IMO when __show_mem() prints something
> the output should be useful at least.
If differentiating between 0 allocations vs disabled is the primary
concern, I think prefacing the dump with the status of the tool is
better than treating =0 and =never as the same.
The way I see it, the {0,1,never} tristate offers a level of versatility
that I'm not sure we need to eliminate.
I'm thinking about cases where we may temporarily set =1 to track some
allocations, then back to =0 'pause' on that exact period of time. Memory
allocation profiling still has those allocations tracked while set to =0
(we can still see them in /proc/allocinfo at least). If a user decided to
do that just before an oom, could they see something useful from
show_mem() even when =0?
> >
> > > where show_mem prints nothing about the information. This will make
> > > further parse easier as we don't have to differentiate what a all
> > > zero line actually means (Does it mean 0 bytes are allocated
> > > or simply memory allocation profiling is disabled).
> > >
> > > The second issue is that multiple entities can call show_mem()
> > > which messed up the allocation info in dmesg. We saw outputs like this:
> > > ```
> > > 327 MiB 83635 mm/compaction.c:1880 func:compaction_alloc
> > > 48.4 GiB 12684937 mm/memory.c:1061 func:folio_prealloc
> > > 7.48 GiB 10899 mm/huge_memory.c:1159 func:vma_alloc_anon_folio_pmd
> > > 298 MiB 95216 kernel/fork.c:318 func:alloc_thread_stack_node
> > > 250 MiB 63901 mm/zsmalloc.c:987 func:alloc_zspage
> > > 1.42 GiB 372527 mm/memory.c:1063 func:folio_prealloc
> > > 1.17 GiB 95693 mm/slub.c:2424 func:alloc_slab_page
> > > 651 MiB 166732 mm/readahead.c:270 func:page_cache_ra_unbounded
> > > 419 MiB 107261 net/core/page_pool.c:572 func:__page_pool_alloc_pages_slow
> > > 404 MiB 103425 arch/x86/mm/pgtable.c:25 func:pte_alloc_one
> > > ```
> > > The above example is because one kthread invokes show_mem()
> > > from __alloc_pages_slowpath while kernel itself calls
> > > oom_kill_process()
> >
> > I'm not familiar with show_mem(). Could you spell out what's wrong with
> > the output above?
>
> So here in the normal case, the output should be sorted by size. Here
> two print happen at the same time so they interleave with each other,
> making further parse harder (need to sort again and dedup).
Gotcha.
> >
> > > Yueyang Pan (2):
> > > mm/show_mem: No print when not mem_alloc_profiling_enabled()
> > > mm/show_mem: Add trylock while printing alloc info
> > >
> > > mm/show_mem.c | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > --
> > > 2.47.3
> > >
>
> Thanks,
> Pan
Powered by blists - more mailing lists