Message-Id: <20210409231842.8840-1-longman@redhat.com>
Date: Fri, 9 Apr 2021 19:18:37 -0400
From: Waiman Long <longman@...hat.com>
To: Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Tejun Heo <tj@...nel.org>, Christoph Lameter <cl@...ux.com>,
Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Vlastimil Babka <vbabka@...e.cz>, Roman Gushchin <guro@...com>
Cc: linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
linux-mm@...ck.org, Shakeel Butt <shakeelb@...gle.com>,
Muchun Song <songmuchun@...edance.com>,
Alex Shi <alex.shi@...ux.alibaba.com>,
Chris Down <chris@...isdown.name>,
Yafang Shao <laoar.shao@...il.com>,
Alexander Duyck <alexander.h.duyck@...ux.intel.com>,
Wei Yang <richard.weiyang@...il.com>,
Masayoshi Mizuma <msys.mizuma@...il.com>,
Waiman Long <longman@...hat.com>
Subject: [PATCH 0/5] mm/memcg: Reduce kmemcache memory accounting overhead

With the recent introduction of the new slab memory controller, we
eliminate the need for having separate kmemcaches for each memory
cgroup and reduce overall kernel memory usage. However, we also add
additional memory accounting overhead to each call of kmem_cache_alloc()
and kmem_cache_free().

Workloads that perform a lot of kmemcache allocations and
de-allocations may experience a performance regression, as illustrated
in [1].

A simple kernel module was used to measure the overhead. At module
init, it runs a loop of 100,000,000 kmem_cache_alloc() and
kmem_cache_free() calls on 64-byte objects. The execution times to
load the kernel module with and without memory accounting were:

  with accounting = 6.798s
  w/o accounting  = 1.758s

That is an increase of 5.04s (287%). With this patchset applied, the
execution time became 4.254s. The memory accounting overhead is now
2.496s, which is a reduction of about 50%.
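
For reference, the arithmetic behind these figures can be reproduced
with a few lines of userspace C. This is only a sketch for checking
the numbers; the function names are made up for illustration and are
not part of the patchset.

```c
#include <assert.h>
#include <stdio.h>

/* Extra run time attributable to memory accounting, in seconds. */
double accounting_overhead(double with_acct, double without_acct)
{
	return with_acct - without_acct;
}

/* Overhead expressed as a percentage of the unaccounted run time. */
double overhead_percent(double with_acct, double without_acct)
{
	assert(without_acct > 0.0);
	return 100.0 * accounting_overhead(with_acct, without_acct) /
	       without_acct;
}

/* Percentage by which the overhead shrinks from 'before' to 'after'. */
double overhead_reduction_percent(double before, double after)
{
	assert(before > 0.0);
	return 100.0 * (before - after) / before;
}

/* Print the figures using the timings quoted in this cover letter. */
void print_summary(void)
{
	const double without_acct = 1.758;	/* seconds */
	const double with_acct = 6.798;		/* seconds, unpatched */
	const double patched = 4.254;		/* seconds, patchset applied */
	double before = accounting_overhead(with_acct, without_acct);
	double after = accounting_overhead(patched, without_acct);

	printf("overhead before: %.3fs (%.0f%%)\n", before,
	       overhead_percent(with_acct, without_acct));
	printf("overhead after:  %.3fs\n", after);
	printf("reduction:       %.1f%%\n",
	       overhead_reduction_percent(before, after));
}
```

Calling print_summary() from a main() reports the overhead before and
after the patchset and the relative reduction between the two.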

It was found that a major part of the memory accounting overhead
is caused by the local_irq_save()/local_irq_restore() sequences used
when updating the local stock charge bytes and the vmstat array, at
least on x86 systems. There are two such sequences in kmem_cache_alloc()
and two in kmem_cache_free(). This patchset tries to reduce the use of
such sequences as much as possible. In fact, it eliminates them in the
common case. Another part of this patchset caches the vmstat data
updates in the local stock as well, which also helps.
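
To illustrate why caching the stat updates helps, here is a
single-threaded userspace model of the general idea. It is not the
code from these patches: struct stat_model, the stat_model_*()
helpers, NR_KEYS and FLUSH_THRESHOLD are all hypothetical, and the
"flushes" counter merely stands in for the number of times an
expensive update of the shared counters would be needed. It does not
model per-cpu data or interrupt disabling.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_KEYS		4	/* hypothetical number of stat counters */
#define FLUSH_THRESHOLD	4096	/* hypothetical pending-bytes limit */

struct stat_model {
	long global[NR_KEYS];	/* stands in for the shared counters */
	unsigned long flushes;	/* times the expensive path was taken */
	int cached_key;		/* counter the pending delta is for, or -1 */
	long cached_bytes;	/* delta not yet folded into global[] */
};

void stat_model_init(struct stat_model *m)
{
	for (int i = 0; i < NR_KEYS; i++)
		m->global[i] = 0;
	m->flushes = 0;
	m->cached_key = -1;
	m->cached_bytes = 0;
}

/* Fold the pending delta, if any, into the shared counter. */
void stat_model_flush(struct stat_model *m)
{
	if (m->cached_key >= 0 && m->cached_bytes != 0) {
		m->global[m->cached_key] += m->cached_bytes;
		m->flushes++;
	}
	m->cached_key = -1;
	m->cached_bytes = 0;
}

/* Record a delta locally; flush on key change or above the threshold. */
void stat_model_mod(struct stat_model *m, int key, long bytes)
{
	assert(key >= 0 && key < NR_KEYS);
	if (m->cached_key != key)
		stat_model_flush(m);
	m->cached_key = key;
	m->cached_bytes += bytes;
	if (labs(m->cached_bytes) > FLUSH_THRESHOLD)
		stat_model_flush(m);
}

/* Model 'n' back-to-back allocate/free pairs of 'size' bytes each. */
void stat_model_alloc_free_pairs(struct stat_model *m, int key,
				 long size, unsigned long n)
{
	for (unsigned long i = 0; i < n; i++) {
		stat_model_mod(m, key, size);
		stat_model_mod(m, key, -size);
	}
}
```

In this model, a tight allocate/free loop on the same counter never
reaches the shared counters at all, because every +size is cancelled
by the following -size while still cached. Only a run of same-sign
updates that crosses the threshold, or a switch to a different
counter, takes the expensive path.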

[1] https://lore.kernel.org/linux-mm/20210408193948.vfktg3azh2wrt56t@gabell/T/#u

Waiman Long (5):
  mm/memcg: Pass both memcg and lruvec to mod_memcg_lruvec_state()
  mm/memcg: Introduce obj_cgroup_uncharge_mod_state()
  mm/memcg: Cache vmstat data in percpu memcg_stock_pcp
  mm/memcg: Separate out object stock data into its own struct
  mm/memcg: Optimize user context object stock access

 include/linux/memcontrol.h |  14 ++-
 mm/memcontrol.c            | 198 ++++++++++++++++++++++++++++++++-----
 mm/percpu.c                |   9 +-
 mm/slab.h                  |  32 +++---
 4 files changed, 195 insertions(+), 58 deletions(-)

--
2.18.1