Date: Tue, 16 Sep 2014 10:34:55 +0900
From: Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: Johannes Weiner <hannes@...xchg.org>,
    Vladimir Davydov <vdavydov@...allels.com>
CC: Michal Hocko <mhocko@...e.cz>, Greg Thelen <gthelen@...gle.com>,
    Hugh Dickins <hughd@...gle.com>,
    Motohiro Kosaki <Motohiro.Kosaki@...fujitsu.com>,
    Glauber Costa <glommer@...il.com>, Tejun Heo <tj@...nel.org>,
    Andrew Morton <akpm@...ux-foundation.org>,
    Pavel Emelianov <xemul@...allels.com>,
    Konstantin Khorenko <khorenko@...allels.com>,
    LKML-MM <linux-mm@...ck.org>,
    LKML-cgroups <cgroups@...r.kernel.org>,
    LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] memory cgroup: my thoughts on memsw

(2014/09/16 4:14), Johannes Weiner wrote:
> Hi Vladimir,
>
> On Thu, Sep 04, 2014 at 06:30:55PM +0400, Vladimir Davydov wrote:
>> To sum it up, the current mem + memsw configuration scheme doesn't allow
>> us to limit swap usage if we want to partition the system dynamically
>> using soft limits. Actually, it also looks rather confusing to me. We
>> have mem limit and mem+swap limit. I bet that from the first glance, an
>> average admin will think it's possible to limit swap usage by setting
>> the limits so that the difference between memory.memsw.limit and
>> memory.limit equals the maximal swap usage, but (surprise!) it isn't
>> really so. It holds if there's no global memory pressure, but otherwise
>> swap usage is only limited by memory.memsw.limit! IMHO, it isn't
>> something obvious.
>
> Agreed, memory+swap accounting & limiting is broken.
>
>> - Anon memory is handled by the user application, while file caches are
>>   all on the kernel. That means the application will *definitely* die
>>   w/o anon memory. W/o file caches it usually can survive, but the more
>>   caches it has the better it feels.
>>
>> - Anon memory is not that easy to reclaim. Swap out is a really slow
>>   process, because data are usually read/written w/o any specific
>>   order. Dropping file caches is much easier. Typically we have lots of
>>   clean pages there.
>>
>> - Swap space is limited. And today, it's OK to have TBs of RAM and only
>>   several GBs of swap. Customers simply don't want to waste their disk
>>   space on that.
>
>> Finally, my understanding (may be crazy!) how the things should be
>> configured. Just like now, there should be mem_cgroup->res accounting
>> and limiting total user memory (cache+anon) usage for processes inside
>> cgroups. This is where there's nothing to do. However, mem_cgroup->memsw
>> should be reworked to account *only* memory that may be swapped out plus
>> memory that has been swapped out (i.e. swap usage).
>
> But anon pages are not a resource, they are a swap space liability.
> Think of virtual memory vs. physical pages - the use of one does not
> necessarily result in the use of the other. Without memory pressure,
> anonymous pages do not consume swap space.
>
> What we *should* be accounting and limiting here is the actual finite
> resource: swap space. Whenever we try to swap a page, its owner
> should be charged for the swap space - or the swapout be rejected.
>
> For hard limit reclaim, the semantics of a swap space limit would be
> fairly obvious, because it's clear who the offender is.
>
> However, in an overcommitted machine, the amount of swap space used by
> a particular group depends just as much on the behavior of the other
> groups in the system, so the per-group swap limit should be enforced
> even during global reclaim to feed back pressure on whoever is causing
> the swapout. If reclaim fails, the global OOM killer triggers, which
> should then off the group with the biggest soft limit excess.
>
> As far as implementation goes, it should be doable to try-charge from
> add_to_swap() and keep the uncharging in swap_entry_free().
>
> We'll also have to extend the global OOM killer to be memcg-aware, but
> we've been meaning to do that anyway.
>

When we introduced the memsw limitation, we tried to avoid affecting global
memory reclaim. That is why we went with a memory+swap limitation.

Now, global memory reclaim is memcg-aware. So, I think swap limitation
rather than anon+swap may be a choice. The change will reduce res_counter
access.

Hmm, it will be desirable to move anon pages to Unevictable if the memcg's
swap slot is 0.

Anyway, I think softlimit should be re-implemented first. That will be the
starting point.

Thanks,
-Kame
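
The difference between the two accounting schemes discussed in this thread can be illustrated with a toy model. The sketch below is not kernel code: the class names, the page counts, and the `global_reclaim()` method are all hypothetical, and it ignores file cache, soft limits, and the OOM killer. It only shows why `memory.memsw.limit - memory.limit` does not bound swap usage under global pressure, and how charging swap space at swapout time (the try-charge idea suggested for add_to_swap()) would.

```python
from dataclasses import dataclass


@dataclass
class MemswGroup:
    """Toy model of the existing scheme: a mem limit plus a mem+swap limit."""
    mem_limit: int
    memsw_limit: int
    resident: int = 0  # anon pages currently in RAM
    swapped: int = 0   # pages currently in swap

    def global_reclaim(self, pages: int) -> int:
        # Swapping a page out moves it from `resident` to `swapped`.
        # The mem+swap total is unchanged, so the memsw limit never objects.
        moved = min(pages, self.resident)
        self.resident -= moved
        self.swapped += moved
        assert self.resident + self.swapped <= self.memsw_limit
        return moved


@dataclass
class SwapLimitedGroup:
    """Toy model of the proposal: charge swap space itself at swapout."""
    mem_limit: int
    swap_limit: int
    resident: int = 0
    swapped: int = 0

    def global_reclaim(self, pages: int) -> int:
        # Try-charge against the swap limit; pages that cannot be
        # charged stay resident (the swapout is rejected).
        wanted = min(pages, self.resident)
        granted = min(wanted, self.swap_limit - self.swapped)
        self.resident -= granted
        self.swapped += granted
        return granted


def demo():
    # Assumed numbers: mem limit 100 pages, mem+swap limit 150 pages.
    old = MemswGroup(mem_limit=100, memsw_limit=150, resident=100)
    naive_bound = old.memsw_limit - old.mem_limit  # what an admin might expect
    old.global_reclaim(100)  # global pressure pushes everything out

    new = SwapLimitedGroup(mem_limit=100, swap_limit=50, resident=100)
    new.global_reclaim(100)  # same pressure, but swap is charged directly

    return naive_bound, old.swapped, new.swapped, new.resident


if __name__ == "__main__":
    print(demo())
```

In this model the first group ends up with more pages in swap than the naive bound suggests, while the second group stops at its swap limit and keeps the remaining pages resident. What should happen to those remaining pages (for example, treating them as unevictable) is the open question raised in the reply above.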