linux-kernel - Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <xr934mzt4rwc.fsf@gthelen.mtv.corp.google.com>
Date:	Mon, 09 Jun 2014 15:52:51 -0700
From:	Greg Thelen <gthelen@...gle.com>
To:	Michal Hocko <mhocko@...e.cz>
Cc:	Johannes Weiner <hannes@...xchg.org>,
	Hugh Dickins <hughd@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Michel Lespinasse <walken@...gle.com>,
	Tejun Heo <tj@...nel.org>,
	Roman Gushchin <klamm@...dex-team.ru>,
	LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org
Subject: Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim

On Fri, Jun 06 2014, Michal Hocko <mhocko@...e.cz> wrote:

> Some users (e.g. Google) would like to have stronger semantic than low
> limit offers currently. The fallback mode is not desirable and they
> prefer hitting OOM killer rather than ignoring low limit for protected
> groups. There are other possible usecases which can benefit from hard
> guarantees. I can imagine workloads where setting low_limit to the same
> value as hard_limit to prevent from any reclaim at all makes a lot of
> sense because reclaim is much more disrupting than restart of the load.
>
> This patch adds a new per memcg memory.reclaim_strategy knob which
> tells what to do in a situation when memory reclaim cannot do any
> progress because all groups in the reclaimed hierarchy are within their
> low_limit. There are two options available:
> 	- low_limit_best_effort - the current mode when reclaim falls
> 	  back to the even reclaim of all groups in the reclaimed
> 	  hierarchy
> 	- low_limit_guarantee - groups within low_limit are never
> 	  reclaimed and OOM killer is triggered instead. OOM message
> 	  will mention the fact that the OOM was triggered due to
> 	  low_limit reclaim protection.

To (a) be consistent with existing hard and soft limits APIs and (b)
allow use of both best effort and guarantee memory limits, I wonder if
it's best to offer three per memcg limits, rather than two limits (hard,
low_limit) and a related reclaim_strategy knob.  The three limits I'm
thinking about are:

1) hard_limit (aka the existing limit_in_bytes cgroupfs file).  No
   change needed here.  This is an upper bound on a memcg hierarchy's
   memory consumption (assuming use_hierarchy=1).

2) best_effort_limit (aka desired working set).  This allow an
   application or administrator to provide a hint to the kernel about
   desired working set size.  Before oom'ing the kernel is allowed to
   reclaim below this limit.  I think the current soft_limit_in_bytes
   claims to provide this.  If we prefer to deprecate
   soft_limit_in_bytes, then a new desired_working_set_in_bytes (or a
   hopefully better named) API seems reasonable.

3) low_limit_guarantee which is a lower bound of memory usage.  A memcg
   would prefer to be oom killed rather than operate below this
   threshold.  Default value is zero to preserve compatibility with
   existing apps.

Logically hard_limit >= best_effort_limit >= low_limit_guarantee.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/