Message-ID: <aK2/Vesgr9Xcl5gy@devbig569.cln6.facebook.com>
Date: Tue, 26 Aug 2025 07:06:13 -0700
From: Yueyang Pan <pyyjason@...il.com>
To: Shakeel Butt <shakeel.butt@...ux.dev>
Cc: Suren Baghdasaryan <surenb@...gle.com>,
	Kent Overstreet <kent.overstreet@...ux.dev>,
	Usama Arif <usamaarif642@...il.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC 0/1] Try to add memory allocation info for cgroup oom kill

On Thu, Aug 21, 2025 at 12:53:03PM -0700, Shakeel Butt wrote:
> On Thu, Aug 21, 2025 at 12:18:00PM -0700, Yueyang Pan wrote:
> > On Thu, Aug 21, 2025 at 11:35:19AM -0700, Shakeel Butt wrote:
> > > On Thu, Aug 14, 2025 at 10:11:56AM -0700, Yueyang Pan wrote:
> > > > Right now in oom_kill_process(), if the OOM is caused by the cgroup 
> > > > limit, we won't get memory allocation information. In some cases, we 
> > > > can have a large cgroup workload running which dominates the machine. 
> > > > The reason for using a cgroup is to leave some resources for the 
> > > > system. When this cgroup is killed, we would also like to have some 
> > > > memory allocation information for the whole server as well. This is 
> > > > the reason behind this mini change. Is it an acceptable thing to do? 
> > > > Will it be too much information for people? I am happy with any 
> > > > suggestions!
> > > 
> > > For a single patch, it is better to have all the context in the patch
> > > and there is no need for cover letter.
> > 
> > Thanks for your suggestion Shakeel! I will change this in the next version.
> > 
> > > 
> > > What exact information you want on the memcg oom that will be helpful
> > > for the users in general? You mentioned memory allocation information,
> > > can you please elaborate a bit more.
> > > 
> > 
> > As in my reply to Suren, I was thinking the system-wide memory usage info 
> > provided by show_free_areas() and memory allocation profiling info can 
> > help us debug cgoom by comparing them with historical data. What is your 
> > take on this?
> > 
> 
> I am not really sure about show_free_areas(). More specifically how the
> historical data diff will be useful for a memcg oom. If you have a
> concrete example, please give one. For memory allocation profiling, is

Sorry for the late reply. I have been trying hard to think of a use case. 
One specific case I can think of is when there is no workload stacking, 
i.e. when one job runs alone on the machine. For example, memory allocation 
profiling can show the memory usage of the network driver, which can make 
it harder for the cgroup to allocate memory and eventually lead to a cgroup 
OOM (cgoom). Without this information, it would be hard to reason about what 
is happening in the kernel when the number of OOMs increases.

show_free_areas() will give a summary of the different types of memory that 
could possibly lead to the increased cgoom in my previous case. One can then 
dig deeper, using memory allocation profiling as an entry point for debugging.
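
To make the "compare with historical data" idea concrete, here is a rough 
userspace sketch (not part of the patch). It assumes two saved snapshots in 
the /proc/allocinfo text format ("<bytes> <calls> <tag>" per line) and 
reports the tags that grew the most. The helper names parse_allocinfo() and 
top_growth(), and the idea of filtering by a path prefix such as 
"drivers/net/", are my own assumptions for illustration.

```python
def parse_allocinfo(text):
    """Map each allocation tag to its total bytes.

    Assumes the /proc/allocinfo layout: "<bytes> <calls> <tag info>".
    Header lines and anything else that does not fit are skipped.
    """
    usage = {}
    for line in text.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue
        size, calls, tag = fields
        if not (size.isdigit() and calls.isdigit()):
            continue
        tag = tag.strip()
        usage[tag] = usage.get(tag, 0) + int(size)
    return usage


def top_growth(baseline_text, current_text, prefix="", limit=10):
    """Return (bytes_grown, tag) pairs, largest growth first.

    prefix is a hypothetical filter on the tag's source path,
    e.g. "drivers/net/" to look only at network drivers.
    """
    before = parse_allocinfo(baseline_text)
    after = parse_allocinfo(current_text)
    growth = []
    for tag, size in after.items():
        if not tag.startswith(prefix):
            continue
        delta = size - before.get(tag, 0)
        if delta > 0:
            growth.append((delta, tag))
    growth.sort(reverse=True)
    return growth[:limit]
```

With a baseline saved during normal operation and a second snapshot taken 
around the time of the OOM, top_growth(baseline, current, "drivers/net/") 
would list the network driver call sites whose footprint grew, which is the 
kind of entry point I have in mind. This only works on kernels built with 
memory allocation profiling enabled.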

Does this make sense to you?

> it possible to filter for the given memcg? Do we save memcg information
> in the memory allocation profiling?

Thanks
Pan
