Message-ID: <161598470492.398.3094442077954239689.tip-bot2@tip-bot2>
Date: Wed, 17 Mar 2021 12:38:24 -0000
From: "tip-bot2 for Namhyung Kim" <tip-bot2@...utronix.de>
To: linux-tip-commits@...r.kernel.org
Cc: Namhyung Kim <namhyung@...nel.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: [tip: perf/core] perf core: Add a kmem_cache for struct perf_event
The following commit has been merged into the perf/core branch of tip:
Commit-ID: bdacfaf26da166dd56c62f23f27a4b3e71f2d89e
Gitweb: https://git.kernel.org/tip/bdacfaf26da166dd56c62f23f27a4b3e71f2d89e
Author: Namhyung Kim <namhyung@...gle.com>
AuthorDate: Thu, 11 Mar 2021 20:54:12 +09:00
Committer: Peter Zijlstra <peterz@...radead.org>
CommitterDate: Tue, 16 Mar 2021 21:44:42 +01:00
perf core: Add a kmem_cache for struct perf_event
The kernel can allocate many struct perf_event instances when
profiling. For example, 256 cpus x 8 events x 20 cgroups = 40K
instances of the struct would be allocated on a large system.
The size of struct perf_event in my setup is 1152 bytes. As it's
allocated by kmalloc, the actual allocation size is rounded up to
2K.
That leaves 896 bytes (~43%) of waste per instance, about 35MB in
total with 40K instances. We can create a dedicated kmem_cache to
avoid this unnecessary memory consumption.
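
The arithmetic above can be checked with a small userspace sketch. It
is not kernel code: kmalloc_bucket_size() and
kmalloc_waste_per_object() are hypothetical helpers, and the model
assumes kmalloc size classes are plain powers of two, which ignores
the extra 96- and 192-byte caches and any debugging overhead.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: round a request up to the next power of two,
 * as a simplified model of the kmalloc size classes.
 */
size_t kmalloc_bucket_size(size_t size)
{
	size_t bucket = 8;

	assert(size > 0);
	while (bucket < size)
		bucket <<= 1;
	return bucket;
}

/* Bytes left unused when one object of 'size' bytes is kmalloc'ed. */
size_t kmalloc_waste_per_object(size_t size)
{
	return kmalloc_bucket_size(size) - size;
}
```

Under these assumptions, kmalloc_waste_per_object(1152) gives 896,
and multiplying by 256 x 8 x 20 instances gives the ~35MB total.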
With this change, I see the following (note that this machine has 112 cpus).
# grep perf_event /proc/slabinfo
perf_event 224 784 1152 7 2 : tunables 24 12 8 : slabdata 112 112 0
The sixth column is pages-per-slab, which is 2, and the fifth column
is obj-per-slab, which is 7. Each 8K slab therefore holds
1152 x 7 = 8064 bytes of objects, and the wasted memory is
(8192 - 8064) / 7 = ~18 bytes per instance.
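
The same per-slab calculation as a userspace sketch. struct slab_fit
and compute_slab_fit() are hypothetical names, the 4096-byte page
size is an assumption, and the model ignores any per-slab management
data, alignment padding or cache coloring the allocator may add.

```c
#include <assert.h>
#include <stddef.h>

/* Result of packing fixed-size objects into one slab. */
struct slab_fit {
	size_t objs_per_slab;	/* whole objects that fit */
	size_t unused_bytes;	/* leftover bytes in the slab */
};

/* Hypothetical helper: pack objects of obj_size bytes into one slab. */
struct slab_fit compute_slab_fit(size_t obj_size, size_t pages_per_slab,
				 size_t page_size)
{
	struct slab_fit fit;
	size_t slab_bytes = pages_per_slab * page_size;

	assert(obj_size > 0 && obj_size <= slab_bytes);
	fit.objs_per_slab = slab_bytes / obj_size;
	fit.unused_bytes = slab_bytes - fit.objs_per_slab * obj_size;
	return fit;
}
```

With compute_slab_fit(1152, 2, 4096), this model yields 7 objects and
128 unused bytes per slab, i.e. about 18 bytes per instance.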
Signed-off-by: Namhyung Kim <namhyung@...nel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Link: https://lkml.kernel.org/r/20210311115413.444407-1-namhyung@kernel.org
---
kernel/events/core.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 03db40f..f526ddb 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -405,6 +405,7 @@ static LIST_HEAD(pmus);
static DEFINE_MUTEX(pmus_lock);
static struct srcu_struct pmus_srcu;
static cpumask_var_t perf_online_mask;
+static struct kmem_cache *perf_event_cache;
/*
* perf event paranoia level:
@@ -4611,7 +4612,7 @@ static void free_event_rcu(struct rcu_head *head)
if (event->ns)
put_pid_ns(event->ns);
perf_event_free_filter(event);
- kfree(event);
+ kmem_cache_free(perf_event_cache, event);
}
static void ring_buffer_attach(struct perf_event *event,
@@ -11293,7 +11294,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
return ERR_PTR(-EINVAL);
}
- event = kzalloc(sizeof(*event), GFP_KERNEL);
+ event = kmem_cache_zalloc(perf_event_cache, GFP_KERNEL);
if (!event)
return ERR_PTR(-ENOMEM);
@@ -11497,7 +11498,7 @@ err_ns:
put_pid_ns(event->ns);
if (event->hw.target)
put_task_struct(event->hw.target);
- kfree(event);
+ kmem_cache_free(perf_event_cache, event);
return ERR_PTR(err);
}
@@ -13130,6 +13131,8 @@ void __init perf_event_init(void)
ret = init_hw_breakpoint();
WARN(ret, "hw_breakpoint initialization failed with: %d", ret);
+ perf_event_cache = KMEM_CACHE(perf_event, SLAB_PANIC);
+
/*
* Build time assertion that we keep the data_head at the intended
* location. IOW, validation we got the __reserved[] size right.