Date:	Mon, 22 Sep 2014 11:04:46 +0400
From:	Vladimir Davydov <vdavydov@...allels.com>
To:	Tejun Heo <tj@...nel.org>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Michal Hocko <mhocko@...e.cz>,
	Greg Thelen <gthelen@...gle.com>,
	Dave Chinner <david@...morbit.com>,
	Glauber Costa <glommer@...il.com>,
	Suleiman Souhlal <suleiman@...gle.com>,
	Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
	<cgroups@...r.kernel.org>
Subject: Re: [PATCH -mm 00/14] Per memcg slab shrinkers

Hi Tejun,

On Sun, Sep 21, 2014 at 12:00:12PM -0400, Tejun Heo wrote:
> On Sun, Sep 21, 2014 at 07:14:32PM +0400, Vladimir Davydov wrote:
> ...
> > list. This is really important, because this allows us to release
> > memcg_cache_id used for indexing in per-memcg arrays. If we don't do
> > this, the arrays will grow uncontrollably, which is really bad. Note, in
> > comparison to user memory reparenting, which Johannes is going to get
> 
> I don't know the code well and haven't read the patches and could
> easily be completely off the mark, but, if the size of the slab array is
> the only issue, wouldn't it be easier to separate that part out?  The
> indexing is only necessary for allocating new items, right?  Can't
> that part be shut down and the index freed on offline and the rest stay
> till release?

That's exactly what I did in this set. I release the cache index on css
offline, but I don't reparent kmem charges, or merge kmem slabs, or
whatever, because we only need to index caches on allocations. So kmem
caches corresponding to a dead memory cgroup will be hanging around
until the last object is freed. And they will hold a css reference, just
like swap charges do and like user memory charges will after Johannes'
rework.
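
To make the lifetime rules concrete, here is a toy user-space model
(made-up names like memcg_stub and cache_stub, not the kernel's actual
types): the index goes back to the allocator at offline time, while the
cache keeps its reference until its last object is freed.

/*
 * Toy user-space model of the idea (made-up names, not the kernel's
 * types): on offline only the array index is given back, while the
 * cache stays alive, pinned by a reference, until its last object is
 * freed.
 */
#include <stdlib.h>

struct memcg_stub {
	int id;		/* index into per-memcg arrays, -1 once released */
	int refcnt;	/* models the css reference count */
};

struct kmem_cache_stub {
	struct memcg_stub *memcg;	/* owner; we hold a reference on it */
	int nr_objects;			/* objects still allocated from here */
};

static void memcg_put(struct memcg_stub *memcg)
{
	if (--memcg->refcnt == 0)
		free(memcg);		/* the "css" goes away only now */
}

static void memcg_offline(struct memcg_stub *memcg)
{
	memcg->id = -1;		/* index released, arrays can stop growing */
	/* the caches and the references they hold are left alone */
}

static void cache_free_object(struct kmem_cache_stub *cache)
{
	if (--cache->nr_objects == 0) {
		/* last object gone: drop the reference, free the cache */
		memcg_put(cache->memcg);
		free(cache);
	}
}

int main(void)
{
	struct memcg_stub *memcg = calloc(1, sizeof(*memcg));
	struct kmem_cache_stub *cache = calloc(1, sizeof(*cache));

	memcg->id = 42;
	memcg->refcnt = 2;	/* one reference for us, one for the cache */
	cache->memcg = memcg;
	cache->nr_objects = 1;

	memcg_offline(memcg);		/* id freed, cache still around */
	memcg_put(memcg);		/* drop our own reference */
	cache_free_object(cache);	/* last object freed -> memcg freed */
	return 0;
}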

However, we still need the cache index for list_lru, which this patch
set makes per-memcg. The point is that objects accounted to a memory
cgroup can be added to or removed from lru lists even after the
cgroup's death. If we just set the cache index of a dead cgroup to its
parent's, new objects will be added to and removed from an active
list_lru, but some objects may still be left on the dead cgroup's own
list_lru. We have to move those before the cache index can be released.
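
The move itself is essentially a list splice (in the kernel this is
what list_splice() boils down to). A rough sketch with illustrative
names and the usual sentinel-headed doubly linked list, not the patch's
actual code:

#include <assert.h>

struct lru_list {
	struct lru_list *prev, *next;
};

static void lru_list_init(struct lru_list *head)
{
	head->prev = head->next = head;
}

static int lru_list_empty(const struct lru_list *head)
{
	return head->next == head;
}

/* Move everything still queued under the dead memcg's id to the parent. */
static void lru_splice_to_parent(struct lru_list *dead, struct lru_list *parent)
{
	if (lru_list_empty(dead))
		return;

	dead->next->prev = parent;
	dead->prev->next = parent->next;
	parent->next->prev = dead->prev;
	parent->next = dead->next;

	lru_list_init(dead);	/* now empty, so the id can be released */
}

int main(void)
{
	struct lru_list dead, parent, item;

	lru_list_init(&dead);
	lru_list_init(&parent);

	/* one leftover item on the dead cgroup's list */
	item.prev = item.next = &dead;
	dead.next = dead.prev = &item;

	lru_splice_to_parent(&dead, &parent);

	assert(lru_list_empty(&dead));
	assert(parent.next == &item && item.next == &parent);
	return 0;
}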

> Things like reparenting tends to add fair amount of complexity and hot
> path overheads which aren't necessary otherwise.

There is no overhead added to hot paths *due to reparenting* in this
set, AFAIU. And the code is way simpler than that of user charge
reparenting, because the usage scenario of list_lru is much simpler
than that of lruvec. For example, we don't have to retry: we are
guaranteed to succeed after the first scan. Just look at the two
functions doing the work - memcg_reparent_all_list_lrus and
reparent_memcg_lru in patch 13 - they are hardly complex, are they?
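
Schematically, the ordering that makes one pass enough looks like this
(a toy counter model with made-up names, not the patch code): once the
dead id resolves to the parent, nothing new can land on the old list,
so a single drain finishes the job.

#include <stdio.h>

#define NR_IDS 4

static int list_of_id[NR_IDS] = { 0, 1, 2, 3 };	/* id -> list mapping */
static int nr_items[NR_IDS];			/* items on each list */

static void lru_add(int memcg_id)
{
	nr_items[list_of_id[memcg_id]]++;
}

static void reparent_lru(int dead_id, int parent_id)
{
	/* 1. Redirect: from now on the dead id resolves to the parent. */
	list_of_id[dead_id] = parent_id;

	/* 2. Drain once: nothing can be added to the old list anymore. */
	nr_items[parent_id] += nr_items[dead_id];
	nr_items[dead_id] = 0;
}

int main(void)
{
	lru_add(2);			/* object charged to memcg id 2 */
	reparent_lru(2, 0);		/* id 2 goes offline, parent is 0 */
	lru_add(2);			/* later add for id 2 hits the parent */
	printf("parent has %d items, dead list has %d\n",
	       nr_items[0], nr_items[2]);
	return 0;
}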

I'd like to emphasize this once again - the reparenting I'm talking
about is not about charges; we're not waiting until the kmem res
counter drops to 0. I just move list_lru items from one list to another
on css offline, without retries, waiting, or odd checks here and there.

Thanks,
Vladimir
