Message-ID: <20151008141851.GD426@dhcp22.suse.cz>
Date: Thu, 8 Oct 2015 16:18:51 +0200
From: Michal Hocko <mhocko@...nel.org>
To: PINTU KUMAR <pintu.k@...sung.com>
Cc: akpm@...ux-foundation.org, minchan@...nel.org, dave@...olabs.net,
koct9i@...il.com, rientjes@...gle.com, hannes@...xchg.org,
penguin-kernel@...ove.sakura.ne.jp, bywxiaobai@....com,
mgorman@...e.de, vbabka@...e.cz, js1304@...il.com,
kirill.shutemov@...ux.intel.com, alexander.h.duyck@...hat.com,
sasha.levin@...cle.com, cl@...ux.com, fengguang.wu@...el.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, cpgs@...sung.com,
pintu_agarwal@...oo.com, pintu.ping@...il.com,
vishnu.ps@...sung.com, rohit.kr@...sung.com,
c.rajkumar@...sung.com, sreenathd@...sung.com
Subject: Re: [PATCH 1/1] mm: vmstat: Add OOM kill count in vmstat counter
On Wed 07-10-15 20:18:16, PINTU KUMAR wrote:
[...]
> Ok, let me explain the real case that we have experienced.
> In our case, we have a low memory killer in user space itself that is invoked
> based on some memory threshold.
> Something like: below a 100MB threshold it starts killing until free memory
> comes back to 150MB.
> During our long duration ageing test (more than 72 hours) we observed that many
> applications were killed.
> Now, we were not sure whether the killing happened in user space or kernel space.
> When we looked at the kernel logs, it had generated many log files such as:
> /var/log/{messages, messages.0, messages.1, messages.2, messages.3, etc.}
> But none of the logs contained kernel OOM messages, although there were some LMK
> kills in user space.
> Then, in another round of tests, we kept dumping the dmesg output to a file after
> each iteration.
> After 3 days of tests, this time we observed that the dmesg output dump contained
> many kernel OOM messages.
I am confused. So you suspect that the OOM report didn't get to
/var/log/messages while it was in dmesg?
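For readers of the archive: the user space killer described above is, in
essence, a loop of the following shape. This is only a rough sketch; the
thresholds, the MemAvailable-based check and the victim selection stub are
illustrative and are not taken from the actual implementation.

/* Rough sketch of a threshold-based user space low memory killer.
 * All values and the victim selection policy are illustrative only. */
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

static long mem_available_kb(void)
{
	char line[128];
	long kb = -1;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f))
		if (sscanf(line, "MemAvailable: %ld", &kb) == 1)
			break;
	fclose(f);
	return kb;
}

/* Policy hook: return the pid of the least important task (stubbed here). */
static pid_t pick_least_important_task(void)
{
	return -1;
}

int main(void)
{
	const long kill_below_kb = 100 * 1024;	/* start killing below ~100MB  */
	const long stop_at_kb = 150 * 1024;	/* stop once back above ~150MB */

	for (;;) {
		long avail = mem_available_kb();

		if (avail >= 0 && avail < kill_below_kb) {
			/* keep killing until we are back above the high mark */
			while (avail >= 0 && avail < stop_at_kb) {
				pid_t victim = pick_least_important_task();

				if (victim <= 0 || kill(victim, SIGKILL) < 0)
					break;
				sleep(1);	/* let reclaim catch up */
				avail = mem_available_kb();
			}
		}
		sleep(1);
	}
	return 0;
}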
> Now, this dumping is not feasible every time. And instead of counting manually
> in the log file, we wanted to know the number of OOM kills that happened during
> these tests.
> So we decided to add a counter in /proc/vmstat to track the kernel oom_kill, and
> monitor it during our ageing test.
>
> Basically, we wanted to tune our user space LMK killer for different threshold
> values, so that we can completely avoid the kernel oom kill.
> So, just by looking at this counter, we would be able to tune the LMK threshold
> values without depending on the kernel log messages.
Wouldn't a trace point suit you better for this particular use case
considering this is a testing environment?
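For the archive, the difference between the two approaches comes down to what
gets emitted at the kill site. A minimal sketch of both follows, assuming a
hypothetical OOM_KILL vm event and an oom_kill trace point; neither exists in
the tree as of this thread and the names are purely illustrative.

/* include/linux/vm_event_item.h (sketch): a new, hypothetical event */
enum vm_event_item {
	/* ... existing items ... */
	OOM_KILL,		/* would show up as "oom_kill" in /proc/vmstat */
	NR_VM_EVENT_ITEMS
};

/* include/trace/events/oom.h (sketch): a hypothetical trace point */
TRACE_EVENT(oom_kill,
	TP_PROTO(struct task_struct *victim),
	TP_ARGS(victim),
	TP_STRUCT__entry(
		__field(pid_t, pid)
		__array(char, comm, TASK_COMM_LEN)
	),
	TP_fast_assign(
		__entry->pid = victim->pid;
		memcpy(__entry->comm, victim->comm, TASK_COMM_LEN);
	),
	TP_printk("pid=%d comm=%s", __entry->pid, __entry->comm)
);

/* mm/oom_kill.c (sketch): called where the victim is actually killed */
static void note_oom_kill(struct task_struct *victim)
{
	count_vm_event(OOM_KILL);	/* cheap, cumulative, global counter   */
	trace_oom_kill(victim);		/* per-event, filterable via tracefs,  */
					/* no reliance on the printk buffer    */
}

The trace point variant can be consumed from user space through tracefs with
no logging daemon at all, which is why it fits a test setup like this one.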
> Also, on most of the systems /var/log/messages is not present and we just
> depend on the kernel dmesg output, which is pretty small for longer runs.
> Even if we reduce the loglevel to 4, it may not be enough to capture all logs.
Hmm, I would consider a logless system considerably crippled but I see
your point and I can imagine that especially small devices might try
to save every single byte of storage. Such a system is basically
undebuggable IMO, but it still might be interesting to see OOM killer
traces.
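For the archive, one way to keep an eye on OOM kills on such a logless system
is to follow the kernel ring buffer directly through /dev/kmsg. A minimal
sketch; the matched string is what kernels of this era print from the OOM
killer and may change:

/* Sketch: count OOM kills on a system without persistent logs by
 * following /dev/kmsg; one read() returns one log record. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	char buf[8192];
	long oom_kills = 0;
	int fd = open("/dev/kmsg", O_RDONLY);

	if (fd < 0)
		return 1;
	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf) - 1);

		if (n < 0 && errno == EPIPE)
			continue;	/* records were overwritten, keep going */
		if (n <= 0)
			break;
		buf[n] = '\0';
		if (strstr(buf, "Out of memory: Kill process")) {
			oom_kills++;
			fprintf(stderr, "oom kills so far: %ld\n", oom_kills);
		}
	}
	return 0;
}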
> > What is even more confusing is the mixing of memcg and global oom
> > conditions. They are really different things. Memcg API will even
> > give you notification about the OOM event.
> >
> Ok, you are suggesting to divide the oom_kill counter into 2 parts (global &
> memcg) ?
> Maybe something like:
> nr_oom_victims
> nr_memcg_oom_victims
You do not need the latter. The memcg interface already provides you with a
notification API and if a counter is _really_ needed then it should be
per-memcg not a global cumulative number.
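For the archive, the notification API in question is the cgroup v1 eventfd
mechanism built on memory.oom_control and cgroup.event_control. A minimal
listener looks roughly like this; the cgroup path is only an example:

/* Minimal sketch of a memcg OOM event listener using the cgroup v1
 * eventfd notification API (memory.oom_control + cgroup.event_control).
 * Adjust the cgroup path to the group you actually care about. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define CG "/sys/fs/cgroup/memory/mygroup"

int main(void)
{
	char cmd[64];
	uint64_t count;
	int efd = eventfd(0, 0);
	int oom_fd = open(CG "/memory.oom_control", O_RDONLY);
	int ctl_fd = open(CG "/cgroup.event_control", O_WRONLY);

	if (efd < 0 || oom_fd < 0 || ctl_fd < 0)
		return 1;

	/* Register: "<eventfd> <fd of memory.oom_control>" */
	snprintf(cmd, sizeof(cmd), "%d %d", efd, oom_fd);
	if (write(ctl_fd, cmd, strlen(cmd)) < 0)
		return 1;

	for (;;) {
		/* Blocks until the memcg hits OOM; count is the number of
		 * events since the previous read. */
		if (read(efd, &count, sizeof(count)) != sizeof(count))
			break;
		printf("new memcg OOM events: %llu\n",
		       (unsigned long long)count);
	}
	return 0;
}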
--
Michal Hocko
SUSE Labs