Message-ID: <20140718154443.GM27940@esperanza>
Date:	Fri, 18 Jul 2014 19:44:43 +0400
From:	Vladimir Davydov <vdavydov@...allels.com>
To:	Johannes Weiner <hannes@...xchg.org>
CC:	Michal Hocko <mhocko@...e.cz>, <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>, Tejun Heo <tj@...nel.org>,
	Hugh Dickins <hughd@...gle.com>,
	Greg Thelen <gthelen@...gle.com>,
	Glauber Costa <glommer@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Subject: Re: [RFC PATCH] memcg: export knobs for the defaul cgroup hierarchy

On Wed, Jul 16, 2014 at 11:58:14AM -0400, Johannes Weiner wrote:
> On Wed, Jul 16, 2014 at 04:39:38PM +0200, Michal Hocko wrote:
> > +#ifdef CONFIG_MEMCG_KMEM
> > +	{
> > +		.name = "kmem.limit_in_bytes",
> > +		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> > +		.write = mem_cgroup_write,
> > +		.read_u64 = mem_cgroup_read_u64,
> > +	},
> 
> Does it really make sense to have a separate limit for kmem only?
> IIRC, the reason we introduced this was that this memory is not
> reclaimable and so we need to limit it.
> 
> But the opposite effect happened: because it's not reclaimable, the
> separate kmem limit is actually unusable for any values smaller than
> the overall memory limit: because there is no reclaim mechanism for
> that limit, once you hit it, it's over, there is nothing you can do
> anymore.  The problem isn't so much unreclaimable memory, the problem
> is unreclaimable limits.
> 
> If the global case produces memory pressure through kernel memory
> allocations, we reclaim page cache, anonymous pages, inodes, dentries
> etc.  I think the same should happen for kmem: kmem should just be
> accounted and limited in the overall memory limit of a group, and when
> pressure arises, we go after anything that's reclaimable.

Personally, I don't think there's much sense in having a separate knob
for the kmem limit either. Until we have a user with a sane use case
for it, let's not propagate it to the new interface.

Furthermore, even when we introduce kmem shrinking, the kmem-only limit
alone won't be very useful, because there are plenty of GFP_NOFS kmem
allocations, which make most slab shrinkers useless. To avoid ENOMEMs
in such a situation, we would have to introduce either a soft kmem
limit (watermark) or some kind of kmem precharge. This means that if we
decided to introduce a kmem-only limit, we'd eventually have to add
more knobs and write more code to make it usable without even knowing
whether anyone would really benefit from it.
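
To illustrate the watermark idea, here is a toy userspace model, not
kernel code. All names (kmem_counter, toy_shrink, toy_charge) are
hypothetical, and the may_shrink flag is only a stand-in for "this
allocation context lets the shrinkers do useful work", i.e. roughly
"not GFP_NOFS". The assumption is that reclaiming down to the soft
limit whenever shrinking is possible leaves headroom for later
allocations that cannot shrink.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy counter; field names are illustrative, not taken from memcg. */
struct kmem_counter {
	size_t usage;		/* bytes currently charged */
	size_t soft_limit;	/* watermark: start reclaiming above this */
	size_t hard_limit;	/* charging beyond this fails */
	size_t reclaimable;	/* part of usage that shrinkers could free */
};

/* Stand-in for running shrinkers: free up to 'want' bytes. */
static size_t toy_shrink(struct kmem_counter *c, size_t want)
{
	size_t freed = want < c->reclaimable ? want : c->reclaimable;

	c->reclaimable -= freed;
	c->usage -= freed;
	return freed;
}

/*
 * Charge 'size' bytes. Returns 0 on success, -1 as a stand-in for
 * -ENOMEM. Assumes usage <= hard_limit on entry.
 */
static int toy_charge(struct kmem_counter *c, size_t size, bool may_shrink)
{
	size_t room = c->hard_limit - c->usage;

	if (size > room) {
		/* A context that cannot shrink has no way out here. */
		if (!may_shrink)
			return -1;
		toy_shrink(c, size - room);
		if (size > c->hard_limit - c->usage)
			return -1;
	}
	c->usage += size;

	/* Above the watermark and able to shrink: reclaim early. */
	if (may_shrink && c->usage > c->soft_limit)
		toy_shrink(c, c->usage - c->soft_limit);
	return 0;
}
```

In this model a charge that cannot shrink fails as soon as the hard
limit is in the way, while the same charge succeeds if an earlier
shrink-capable charge has already pushed usage back to the watermark.
It is exactly this extra knob and code that a kmem-only limit would
need.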

However, there might be users that only want user memory limiting and
don't want to pay the price of kmem accounting, which is pretty
expensive. Even if we implement percpu stocks for kmem, there will
still be noticeable overhead due to touching more cache lines on
kmalloc/kfree.

So I guess there should be a tunable that allows toggling memcg
features, maybe a bitmask for future extensibility.
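
A minimal sketch of such a bitmask, as a self-contained userspace
model. The names (memcg_feature, MEMCG_FEAT_*, memcg_sketch and the
two helpers) are hypothetical and do not correspond to any existing
kernel interface. The one design assumption is that unknown bits are
rejected, so that new feature bits can be added later without changing
the meaning of old values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; these names are not from the kernel. */
enum memcg_feature {
	MEMCG_FEAT_USER = 1u << 0,	/* account user memory */
	MEMCG_FEAT_KMEM = 1u << 1,	/* account kernel memory */
	/* remaining bits are left free for future features */
};

#define MEMCG_FEAT_ALL ((uint32_t)(MEMCG_FEAT_USER | MEMCG_FEAT_KMEM))

/* Stand-in for a cgroup's per-group state. */
struct memcg_sketch {
	uint32_t features;
};

/* Set the enabled features; returns -1 if the mask has unknown bits. */
static int memcg_set_features(struct memcg_sketch *cg, uint32_t mask)
{
	assert(cg != NULL);
	if (mask & ~MEMCG_FEAT_ALL)
		return -1;
	cg->features = mask;
	return 0;
}

/* Check a single feature bit, e.g. before doing kmem accounting. */
static bool memcg_feature_enabled(const struct memcg_sketch *cg,
				  enum memcg_feature f)
{
	return (cg->features & (uint32_t)f) != 0;
}
```

A group that only wants user memory limiting would be set up with
memcg_set_features(&cg, MEMCG_FEAT_USER), and the kmalloc/kfree paths
would skip accounting whenever memcg_feature_enabled(&cg,
MEMCG_FEAT_KMEM) is false.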