Message-ID: <84FF21A720B0874AA94B46D76DB9826904556CB7@008-AM1MPN1-003.mgdnok.nokia.com>
Date:	Thu, 12 Jan 2012 08:32:16 +0000
From:	<leonid.moiseichuk@...ia.com>
To:	<rientjes@...gle.com>
CC:	<gregkh@...e.de>, <linux-mm@...ck.org>,
	<linux-kernel@...r.kernel.org>, <cesarb@...arb.net>,
	<kamezawa.hiroyu@...fujitsu.com>, <emunson@...bm.net>,
	<penberg@...nel.org>, <aarcange@...hat.com>, <riel@...hat.com>,
	<mel@....ul.ie>, <dima@...roid.com>, <rebecca@...roid.com>,
	<san@...gle.com>, <akpm@...ux-foundation.org>,
	<vesa.jaaskelainen@...ia.com>
Subject: RE: [PATCH 3.2.0-rc1 3/3] Used Memory Meter pseudo-device module

> -----Original Message-----
> From: ext David Rientjes [mailto:rientjes@...gle.com]
> Sent: 11 January, 2012 22:45
 
> I think you're misunderstanding the proposal; in the case of a global oom
> (that means without memcg) then, by definition, all threads that are
> allocating memory would be frozen and incur the delay at the point they
> would currently call into the oom killer.  If your userspace is alive, i.e. the
> application responsible for managing oom killing, then it can wait on
> eventfd(2), wake up, and then send SIGTERM and SIGKILL to the appropriate
> threads based on priority.
> 
> So, again, why wouldn't this work for you?

As I wrote, the proposed change is not a safety belt but a look-ahead radar.
If it detects that we are getting close to a wall, it starts to sound an alarm, and the alarm gets louder as the distance to the wall shrinks.
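
To make the radar analogy concrete, here is a minimal user-space sketch of an alarm level that grows as used memory approaches the total. It is an illustration only and not the interface of the proposed module: the function names, the 70% starting threshold, the four alarm levels, and the definition of "used" as MemTotal - MemFree - Buffers - Cached are all assumptions made for the example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_kb(fields):
    """Assumed definition of used memory: total minus free, buffers and page cache."""
    reclaimable = (fields.get("MemFree", 0)
                   + fields.get("Buffers", 0)
                   + fields.get("Cached", 0))
    return fields["MemTotal"] - reclaimable


def alarm_level(used, total, start_pct=70, levels=4):
    """Return 0 at or below start_pct, then 1..levels as usage nears 100%.

    Integer arithmetic is used so that band boundaries are exact.
    """
    over = used * 100 - start_pct * total
    if over <= 0:
        return 0
    span = (100 - start_pct) * total
    return min(levels, over * levels // span + 1)


def read_level(path="/proc/meminfo"):
    """Compute the current alarm level from a meminfo file (hypothetical helper)."""
    with open(path) as f:
        fields = parse_meminfo(f.read())
    return alarm_level(used_kb(fields), fields["MemTotal"])
```

With these assumed defaults, usage at or below 70% gives level 0, 85% gives level 3, and anything from 92.5% upward gives level 4, so a listener can start with cheap actions early and escalate as the level rises.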

In close-to-OOM situations the device becomes very slow, which is bad for the user. The size of the slowdown depends on code size and on storage performance
when code pages are thrashed, but even 20% is noticeable. In practice, slowdowns of 2x-5x were observed.

We can take action ahead of time and try to prevent OOM altogether, for example by shrinking caches in applications, closing unused apps, etc.  If OOM still happens because of
3rd-party components or misbehaving software, even the default OOM killer works well enough, provided oom_score_adj values are set properly.
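
As a hedged illustration of "properly set" oom_score_adj values: the kernel exposes the per-process knob as /proc/<pid>/oom_score_adj, accepting integers from -1000 (never kill) to 1000 (kill first). The helper below is a sketch; its name and the proc_root parameter (there only so the function can be exercised against a scratch directory) are assumptions for the example.

```python
import os

OOM_SCORE_ADJ_MIN = -1000  # process is exempt from the OOM killer
OOM_SCORE_ADJ_MAX = 1000   # process is the preferred victim


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write an OOM score adjustment for `pid` (hypothetical helper).

    Raises ValueError before touching the filesystem if `value` is
    outside the range the kernel accepts.
    """
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj out of range: {value}")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(f"{value}\n")
```

For example, a launcher could call set_oom_score_adj(pid, 900) for background apps and keep a lower value for the foreground one. Lowering the value of a process generally requires CAP_SYS_RESOURCE.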

Thus, controlling the device across a wider set of memory situations looks more beneficial to me than trying to recover once the situation is already bad. And increasing the complexity
of the recovery mechanism (OOM, Android OOM, OOM with delay) by involving user space in decision-making makes recovery _potentially_ less predictable.

Best Wishes,
Leonid

