[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20170117235411.9408-10-tj@kernel.org>
Date: Tue, 17 Jan 2017 15:54:10 -0800
From: Tejun Heo <tj@...nel.org>
To: vdavydov.dev@...il.com, cl@...ux.com, penberg@...nel.org,
rientjes@...gle.com, iamjoonsoo.kim@....com,
akpm@...ux-foundation.org
Cc: jsvana@...com, hannes@...xchg.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, cgroups@...r.kernel.org, kernel-team@...com,
Tejun Heo <tj@...nel.org>
Subject: [PATCH 09/10] slab: remove slub sysfs interface files early for empty memcg caches
With kmem cgroup support enabled, kmem_caches can be created and
destroyed frequently and a great number of near empty kmem_caches can
accumulate if there are a lot of transient cgroups and the system is
not under memory pressure. When memory reclaim starts under such
conditions, it can lead to consecutive deactivation and destruction of
many kmem_caches, easily hundreds of thousands on moderately large
systems, exposing scalability issues in the current slab management
code. This is one of the patches to address the issue.
Each cache has a number of sysfs interface files under
/sys/kernel/slab. On a system with a lot of memory and transient
memcgs, the number of interface files which have to be removed once
memory reclaim kicks in can reach millions.
Signed-off-by: Tejun Heo <tj@...nel.org>
Reported-by: Jay Vana <jsvana@...com>
Acked-by: Vladimir Davydov <vdavydov.dev@...il.com>
Cc: Christoph Lameter <cl@...ux.com>
Cc: Pekka Enberg <penberg@...nel.org>
Cc: David Rientjes <rientjes@...gle.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@....com>
Cc: Andrew Morton <akpm@...ux-foundation.org>
---
mm/slub.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 184f80b..5bffa1f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3951,8 +3951,20 @@ int __kmem_cache_shrink(struct kmem_cache *s)
#ifdef CONFIG_MEMCG
static void kmemcg_cache_deact_after_rcu(struct kmem_cache *s)
{
- /* called with all the locks held after a sched RCU grace period */
- __kmem_cache_shrink(s);
+ /*
+ * Called with all the locks held after a sched RCU grace period.
+ * Even if @s becomes empty after shrinking, we can't know that @s
+ * doesn't have allocations already in-flight and thus can't
+ * destroy @s until the associated memcg is released.
+ *
+ * However, let's remove the sysfs files for empty caches here.
+ * Each cache has a lot of interface files which aren't
+ * particularly useful for empty draining caches; otherwise, we can
+ * easily end up with millions of unnecessary sysfs files on
+ * systems which have a lot of memory and transient cgroups.
+ */
+ if (!__kmem_cache_shrink(s))
+ sysfs_slab_remove(s);
}
void __kmemcg_cache_deactivate(struct kmem_cache *s)
@@ -5651,6 +5663,15 @@ static void sysfs_slab_remove(struct kmem_cache *s)
*/
return;
+ if (!s->kobj.state_in_sysfs)
+ /*
+ * For a memcg cache, this may be called during
+ * deactivation and again on shutdown. Remove only once.
+ * A cache is never shut down before deactivation is
+ * complete, so no need to worry about synchronization.
+ */
+ return;
+
#ifdef CONFIG_MEMCG
kset_unregister(s->memcg_kset);
#endif
--
2.9.3
Powered by blists - more mailing lists