Date:	Thu, 8 Oct 2015 18:30:50 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	PINTU KUMAR <pintu.k@...sung.com>
Cc:	akpm@...ux-foundation.org, minchan@...nel.org, dave@...olabs.net,
	koct9i@...il.com, rientjes@...gle.com, hannes@...xchg.org,
	penguin-kernel@...ove.sakura.ne.jp, bywxiaobai@....com,
	mgorman@...e.de, vbabka@...e.cz, js1304@...il.com,
	kirill.shutemov@...ux.intel.com, alexander.h.duyck@...hat.com,
	sasha.levin@...cle.com, cl@...ux.com, fengguang.wu@...el.com,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org, cpgs@...sung.com,
	pintu_agarwal@...oo.com, pintu.ping@...il.com,
	vishnu.ps@...sung.com, rohit.kr@...sung.com,
	c.rajkumar@...sung.com, sreenathd@...sung.com
Subject: Re: [PATCH 1/1] mm: vmstat: Add OOM kill count in vmstat counter

On Thu 08-10-15 21:36:24, PINTU KUMAR wrote:
[...]
> However, these OOM logs were not found in /var/log/messages.
> Maybe we do heavy logging, because in the ageing test we enable maximum
> functionality (Wi-Fi, BT, GPS, fully loaded system).

If you swamp your logs so heavily that even critical messages won't make
it into the log files, then your logging is basically useless for
anything serious. But that is not really that important.

> Hope it is clear now. If not, please ask me for more information.
> 
> > 
> > > Now, dumping this every time is not feasible. And instead of counting
> > > manually in the log file, we wanted to know the number of oom kills
> > > that happened during these tests.
> > > So we decided to add a counter in /proc/vmstat to track the kernel
> > > oom_kill, and monitor it during our ageing test.
> > >
> > > Basically, we wanted to tune our user-space LMK killer for different
> > > threshold values, so that we can completely avoid the kernel oom kill.
> > > So, just by looking at this counter, we are able to tune the LMK
> > > threshold values without depending on the kernel log messages.
> > 
> > Wouldn't a tracepoint suit you better for this particular use case,
> > considering this is a testing environment?
> > 
> Tracing for the oom_kill count?
> Actually, tracing-related configs are normally disabled in the release binary.

Yes, but your use case described a testing environment.

> And it is not always feasible to perform tracing for such long-duration tests.

I do not see why long duration would be a problem. Each tracepoint can
be enabled separately.
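
For illustration, a minimal sketch of flipping a single event group on
from userspace (this assumes tracefs at the usual 2015-era
/sys/kernel/debug/tracing mount point and the oom event group, which
IIRC only carries oom_score_adj_update at the moment; a dedicated
oom-kill event is exactly the kind of thing being discussed here):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* per-group enable file; each individual event has one too */
	const char *enable = "/sys/kernel/debug/tracing/events/oom/enable";
	int fd = open(enable, O_WRONLY);

	if (fd < 0) {
		perror(enable);
		return 1;
	}
	if (write(fd, "1", 1) != 1) {	/* write "0" to disable again */
		perror("write");
		close(fd);
		return 1;
	}
	close(fd);
	/* hits now stream into .../trace_pipe; a disabled tracepoint
	 * costs next to nothing, which is why long-duration runs are
	 * not a problem */
	return 0;
}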

> Then it should be valid for other counters as well.
> 
> > > Also, on most of these systems /var/log/messages is not present and
> > > we just depend on the kernel dmesg output, which is pretty small for
> > > longer runs. Even if we reduce the loglevel to 4, it may not be
> > > enough to capture all logs.
> > 
> > Hmm, I would consider a logless system considerably crippled, but I see
> > your point and I can imagine that especially small devices might try to
> > save every single byte of storage. Such a system is basically
> > undebuggable IMO, but it still might be interesting to see OOM killer
> > traces.
> > 
> Exactly, some of the small embedded systems might have 512MB, 256MB,
> 128MB, or even less memory.
> Also, the storage space will be 8GB or below.
> In such a system we cannot afford heavy log files, and exact tuning and
> stability are most important.

And that is what the log level is for. If your logs are heavy with
error-level messages then you are far from being production ready... ;)

> Even the tracing/profiling configs will be disabled to the lowest level
> to reduce the kernel code size as well.

What level is that? crit? Is err really that noisy?
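
For reference, a small sketch of checking where the console loglevel
actually sits (assuming the standard four-field layout of
/proc/sys/kernel/printk; a message reaches the console when its level is
strictly below the first field, so loglevel 4 still lets KERN_ERR (3)
through):

#include <stdio.h>

int main(void)
{
	int console, dflt, min, boot;
	FILE *f = fopen("/proc/sys/kernel/printk", "r");

	if (!f || fscanf(f, "%d %d %d %d", &console, &dflt, &min, &boot) != 4)
		return 1;
	fclose(f);
	printf("console loglevel: %d (KERN_ERR visible: %s)\n",
	       console, console > 3 ? "yes" : "no");
	return 0;
}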
 
[...]
> > > Ok, you are suggesting to divide the oom_kill counter into 2 parts
> > > (global & memcg)?
> > > Maybe something like:
> > > nr_oom_victims
> > > nr_memcg_oom_victims
> > 
> > You do not need the latter. The memcg interface already provides you
> > with a notification API, and if a counter is _really_ needed then it
> > should be per-memcg, not a global cumulative number.
> 
> Ok, for memory cgroups, you mean this one?
> sh-3.2# cat /sys/fs/cgroup/memory/memory.oom_control
> oom_kill_disable 0
> under_oom 0

Yes, this is the notification API.
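
For reference, registering for it is only a few lines of userspace; a
sketch assuming the v1 memory controller mounted at /sys/fs/cgroup/memory
as in your listing (normally you would watch the directory of the
specific group you care about rather than the root):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/eventfd.h>

int main(void)
{
	int efd = eventfd(0, 0);
	int ofd = open("/sys/fs/cgroup/memory/memory.oom_control", O_RDONLY);
	int cfd = open("/sys/fs/cgroup/memory/cgroup.event_control", O_WRONLY);
	char buf[64];
	uint64_t hits;

	if (efd < 0 || ofd < 0 || cfd < 0)
		return 1;

	/* "<eventfd> <fd of memory.oom_control>" registers the listener */
	int len = snprintf(buf, sizeof(buf), "%d %d", efd, ofd);
	if (write(cfd, buf, len) != len)
		return 1;

	/* blocks until an OOM event fires in this memcg */
	if (read(efd, &hits, sizeof(hits)) == sizeof(hits))
		printf("memcg OOM events: %llu\n", (unsigned long long)hits);
	return 0;
}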

> I am actually confused about what to do next.
> Shall I push a new patch set with just the
> nr_oom_victims counter?

Yes, you can repost with a better description of typical usage
scenarios. I cannot say I would be completely sold on this, because
the only relevant use case I've heard so far is the logless system,
which is pretty much a corner case. That is not a reason to nack it,
though. It is definitely better than the original oom_stall suggestion
because it has clear semantics at least.
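
For what it is worth, the monitoring loop would then be trivial; a
sketch polling /proc/vmstat for the proposed counter (nr_oom_victims is
only the name under discussion in this thread, nothing any released
kernel exports):

#include <stdio.h>
#include <unistd.h>

static long read_oom_victims(void)
{
	char line[128];
	long val = -1;
	FILE *f = fopen("/proc/vmstat", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f))
		if (sscanf(line, "nr_oom_victims %ld", &val) == 1)
			break;
	fclose(f);
	return val;
}

int main(void)
{
	long prev = read_oom_victims();

	for (;;) {
		sleep(60);	/* sample once a minute */
		long cur = read_oom_victims();

		if (prev >= 0 && cur > prev)	/* bump the LMK threshold here */
			fprintf(stderr, "%ld OOM kill(s) in the last minute\n",
				cur - prev);
		prev = cur;
	}
}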
-- 
Michal Hocko
SUSE Labs
