Message-ID: <ZWcBDglmDKUJdwMv@tiehlicka>
Date:   Wed, 29 Nov 2023 10:14:54 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     Roman Gushchin <roman.gushchin@...ux.dev>
Cc:     Qi Zheng <zhengqi.arch@...edance.com>,
        Kent Overstreet <kent.overstreet@...ux.dev>,
        Muchun Song <muchun.song@...ux.dev>,
        Linux-MM <linux-mm@...ck.org>, linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Chinner <david@...morbit.com>
Subject: Re: [PATCH 2/7] mm: shrinker: Add a .to_text() method for shrinkers

On Tue 28-11-23 16:34:35, Roman Gushchin wrote:
> On Tue, Nov 28, 2023 at 02:23:36PM +0800, Qi Zheng wrote:
[...]
> > Now I think adding this method might not be a good idea. If we allow
> > shrinkers to report their own private information, OOM logs may become
> > cluttered. Most people only care about some general information when
> > troubleshooting OOM problem, but not the private information of a
> > shrinker.
> 
> I agree with that.
> 
> It seems that the feature is mostly useful for kernel developers and it's easily
> achievable by attaching a bpf program to the oom handler. If it requires a bit
> of work on the bpf side, we can do that instead, but probably not. And this
> solution can potentially provide way more information in a more flexible way.
> 
> So I'm not convinced it's a good idea to make the generic oom handling code
> more complicated and fragile for everybody, as well as making oom reports differ
> more between kernel versions and configurations.

Completely agreed! From my many years of experience analysing oom reports
from production systems I would distinguish the following categories:
	- clear runaways (and/or memory leaks)
		- userspace consumers - either shmem or anonymous memory
		  predominantly consumes the memory, swap is either depleted
		  or not configured.
		  The OOM report is usually useful to pinpoint those as we
		  have the required counters available.
		- kernel memory consumers - if we are lucky they are
		  using the slab allocator and unreclaimable slab is a huge
		  part of the memory consumption. If this is a page
		  allocator user the oom report only helps to deduce
		  that fact by looking at how much user + slab + page
		  table etc. add up to. But identifying the root cause is
		  close to impossible without something like page_owner
		  or a crash dump.
	- misbehaving memory reclaim
		- minority of issues and the oom report is usually
		  insufficient to drill down to the root cause. If the
		  problem is reproducible then collecting vmstat data
		  can give a much better clue.
		- a high number of slab reclaimable objects or free swap
		  are good indicators. Shrinker data could be
		  potentially helpful in the slab case but I have a really
		  hard time remembering any such situation.
On non-production systems the situation is quite different. I can see
how it could be very beneficial to add very specific debugging data
for a subsystem/shrinker which is being developed and could cause the OOM.
For that purpose the proposed scheme is rather inflexible AFAICS.
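
As an illustration of the vmstat collection mentioned above for reproducible
reclaim problems, here is a minimal userspace sketch. It assumes the usual
"name value" line format of /proc/vmstat. The function names, the sampling
interval and the sample count are hypothetical choices made for this example;
which counters are worth watching depends on the problem at hand.

```python
import time


def parse_vmstat(text):
    """Parse "name value" lines (the /proc/vmstat format) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <integer>".
        if len(parts) == 2 and parts[1].lstrip("-").isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def vmstat_delta(before, after):
    """Return only the counters whose value changed between two samples."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


def sample_vmstat(interval=1.0, count=5, path="/proc/vmstat"):
    """Yield (timestamp, delta) pairs for `count` consecutive intervals.

    `interval` and `count` are arbitrary defaults for this sketch.
    """
    with open(path) as f:
        prev = parse_vmstat(f.read())
    for _ in range(count):
        time.sleep(interval)
        with open(path) as f:
            cur = parse_vmstat(f.read())
        yield time.time(), vmstat_delta(prev, cur)
        prev = cur
```

Iterating over sample_vmstat() on a Linux system while the problem is being
reproduced and logging each delta gives a per-interval view of how the
counters move, which can then be read alongside the oom report.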

-- 
Michal Hocko
SUSE Labs