Message-ID: <aLCMTu-Ci2yV40zn@fedora>
Date: Thu, 28 Aug 2025 10:05:18 -0700
From: "Vishal Moola (Oracle)" <vishal.moola@...il.com>
To: Yueyang Pan <pyyjason@...il.com>
Cc: Suren Baghdasaryan <surenb@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Vlastimil Babka <vbabka@...e.cz>, Michal Hocko <mhocko@...e.com>,
	Brendan Jackman <jackmanb@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>, Zi Yan <ziy@...dia.com>,
	Usama Arif <usamaarif642@...il.com>, linux-mm@...ck.org,
	kernel-team@...a.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 0/2] mm/show_mem: Bug fix for print mem alloc info

On Thu, Aug 28, 2025 at 01:29:08AM -0700, Yueyang Pan wrote:
> On Wed, Aug 27, 2025 at 12:51:17PM -0700, Vishal Moola (Oracle) wrote:
> > On Wed, Aug 27, 2025 at 11:34:21AM -0700, Yueyang Pan wrote:
> > > This patch set fixes two issues we saw in production rollout. 
> > > 
> > > The first issue is that we saw all zero output of memory allocation 
> > > profiling information from show_mem() if CONFIG_MEM_ALLOC_PROFILING 
> > > is set and sysctl.vm.mem_profiling=0. In this case, the behaviour 
> > > should be the same as when CONFIG_MEM_ALLOC_PROFILING is unset, 
> > 
> > Did you mean to say when sysctl.vm.mem_profiling=never?
> > 
> > My understanding is that setting the sysctl to 0 pauses memory allocation
> > profiling, while 1 resumes it. Setting the sysctl to never should be the
> > same as when the config is unset, but I suspect we might still want the
> > info when set to 0.
> 
> Thanks for your feedback, Vishal. Here I mean both =0 and =never.
> In both cases, __show_mem() currently prints all 0s, which is both redundant
> and hard to tell apart from a real zero. IMO, when __show_mem() prints
> something, the output should at least be useful.

If differentiating between 0 allocations vs disabled is the primary
concern, I think prefacing the dump with the status of the tool is
better than treating =0 and =never as the same.

The way I see it, the {0,1,never} tristate offers a level of versatility
that I'm not sure we need to eliminate.

I'm thinking about cases where we may temporarily set =1 to track some
allocations, then set it back to =0 to 'pause' on that exact period of time. Memory
allocation profiling still has those allocations tracked while set to =0
(we can still see them in /proc/allocinfo at least). If a user decided to
do that just before an oom, could they see something useful from
show_mem() even when =0?
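
To make the "preface the dump with the status" idea concrete, here is a
rough user-space sketch. It is not the show_mem() code: the enum, the
struct, and the function names are all made up for illustration, and the
output format is only an assumption. The point is that one status line
up front lets a parser tell "paused, numbers are a snapshot" apart from
"never enabled, no data", without dropping the rows when =0.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the {never,0,1} tristate; not a kernel type. */
enum prof_state { PROF_NEVER, PROF_PAUSED, PROF_ENABLED };

/* Hypothetical row of allocation info, loosely modeled on the dmesg lines. */
struct alloc_row {
	const char *site;		/* e.g. "mm/memory.c:1061 func:folio_prealloc" */
	unsigned long long bytes;
	unsigned long long calls;
};

static const char *prof_state_name(enum prof_state s)
{
	switch (s) {
	case PROF_NEVER:	return "never";
	case PROF_PAUSED:	return "0 (paused)";
	case PROF_ENABLED:	return "1 (enabled)";
	}
	return "unknown";
}

/* Bounded append: returns the new offset, clamped to the last byte of buf. */
static size_t append(char *buf, size_t len, size_t off, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (off >= len)
		return off;
	va_start(ap, fmt);
	n = vsnprintf(buf + off, len - off, fmt, ap);
	va_end(ap);
	if (n < 0)
		return off;
	return (off + (size_t)n < len) ? off + (size_t)n : len - 1;
}

/*
 * Always emit the status line first. Rows are printed for both "enabled"
 * and "paused" (counters persist while paused), and skipped for "never".
 */
static size_t dump_alloc_info(char *buf, size_t len, enum prof_state s,
			      const struct alloc_row *rows, size_t n)
{
	size_t off = 0;
	size_t i;

	off = append(buf, len, off, "Memory allocation profiling: %s\n",
		     prof_state_name(s));
	if (s == PROF_NEVER)
		return off;

	for (i = 0; i < n; i++)
		off = append(buf, len, off, "%12llu %8llu %s\n",
			     rows[i].bytes, rows[i].calls, rows[i].site);
	return off;
}
```

With something like this, a parser only needs to read the first line to
know how to interpret (or whether to expect) the rows that follow.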

> > 
> > > where show_mem() prints no allocation profiling information. This will
> > > make further parsing easier, as we don't have to work out what an
> > > all-zero line actually means (does it mean 0 bytes are allocated,
> > > or simply that memory allocation profiling is disabled?).
> > > 
> > > The second issue is that multiple entities can call show_mem()
> > > concurrently, which garbles the allocation info in dmesg. We saw output like this:
> > > ```
> > >     327 MiB    83635 mm/compaction.c:1880 func:compaction_alloc
> > >    48.4 GiB 12684937 mm/memory.c:1061 func:folio_prealloc
> > >    7.48 GiB    10899 mm/huge_memory.c:1159 func:vma_alloc_anon_folio_pmd
> > >     298 MiB    95216 kernel/fork.c:318 func:alloc_thread_stack_node
> > >     250 MiB    63901 mm/zsmalloc.c:987 func:alloc_zspage
> > >     1.42 GiB   372527 mm/memory.c:1063 func:folio_prealloc
> > >     1.17 GiB    95693 mm/slub.c:2424 func:alloc_slab_page
> > >      651 MiB   166732 mm/readahead.c:270 func:page_cache_ra_unbounded
> > >      419 MiB   107261 net/core/page_pool.c:572 func:__page_pool_alloc_pages_slow
> > >      404 MiB   103425 arch/x86/mm/pgtable.c:25 func:pte_alloc_one
> > > ```
> > > The above example occurs because one kthread invokes show_mem()
> > > from __alloc_pages_slowpath() while the kernel itself calls
> > > oom_kill_process().
> > 
> > I'm not familiar with show_mem(). Could you spell out what's wrong with
> > the output above?
> 
> So in the normal case, the output should be sorted by size. Here
> two prints happened at the same time, so they interleaved with each other,
> making further parsing harder (we need to sort again and dedup).

Gotcha.
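
For anyone else reading along, my mental model of the trylock approach
is roughly the user-space sketch below. This is only an assumption about
the general shape, not the code from patch 2: the names are hypothetical
and a C11 atomic_flag stands in for whatever lock the patch actually
uses. The loser of the race skips its dump instead of waiting, so the
sorted block from the winner stays contiguous.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a lock guarding the alloc info dump. */
static atomic_flag dump_busy = ATOMIC_FLAG_INIT;

/* Returns true if the caller now owns the dump; never blocks. */
static bool dump_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&dump_busy,
						  memory_order_acquire);
}

static void dump_unlock(void)
{
	atomic_flag_clear_explicit(&dump_busy, memory_order_release);
}

/*
 * Print all lines as one uninterrupted block, or print nothing if
 * another caller is already in the middle of its own dump.
 * Returns true if this caller printed, false if it skipped.
 */
static bool print_alloc_info(FILE *out, const char *const *lines, size_t n)
{
	size_t i;

	if (!dump_trylock())
		return false;

	for (i = 0; i < n; i++)
		fprintf(out, "%s\n", lines[i]);

	dump_unlock();
	return true;
}
```

The trade-off is that a concurrent caller gets no allocation info of its
own, which seems acceptable if the other dump lands in the same dmesg
window anyway.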

> > 
> > > Yueyang Pan (2):
> > >   mm/show_mem: No print when not mem_alloc_profiling_enabled()
> > >   mm/show_mem: Add trylock while printing alloc info
> > > 
> > >  mm/show_mem.c | 5 ++++-
> > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > -- 
> > > 2.47.3
> > > 
> 
> Thanks,
> Pan
