linux-kernel - Re: [PATCH 1/2] perf core: Add a kmem_cache for struct perf


Message-ID: <YEoNFA5gI0jp/zlF@hirez.programming.kicks-ass.net>
Date:   Thu, 11 Mar 2021 13:29:08 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Namhyung Kim <namhyung@...nel.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Jiri Olsa <jolsa@...hat.com>, Ingo Molnar <mingo@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Stephane Eranian <eranian@...gle.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Ian Rogers <irogers@...gle.com>,
        David Rientjes <rientjes@...gle.com>,
        Namhyung Kim <namhyung@...gle.com>
Subject: Re: [PATCH 1/2] perf core: Add a kmem_cache for struct perf_event

On Thu, Mar 11, 2021 at 08:54:12PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung@...gle.com>
> 
> The kernel can allocate a lot of struct perf_event when profiling. For
> example, 256 cpu x 8 events x 20 cgroups = 40K instances of the struct
> would be allocated on a large system.
> 
> The size of struct perf_event in my setup is 1152 bytes. As it's
> allocated by kmalloc, the actual allocation size is rounded up
> to 2K.
> 
> That leaves 896 bytes (~43%) of waste per instance, about 35MB in
> total with 40K instances. We can create a dedicated kmem_cache to
> avoid this unnecessary memory consumption.
> 
> With this change, I see the following (note this machine has 112 cpus).
> 
>   # grep perf_event /proc/slabinfo
>   perf_event    224    784   1152    7    2 : tunables   24   12    8 : slabdata    112    112      0
> 
> The sixth column is pages-per-slab, which is 2, and the fifth column is
> obj-per-slab, which is 7.  Thus a slab uses 1152 x 7 = 8064 bytes of
> the 8K, and the wasted memory is (8192 - 8064) / 7 = ~18 bytes per
> instance.
> 
> Signed-off-by: Namhyung Kim <namhyung@...nel.org>

Thanks for both!
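
For anyone who wants to re-check the numbers in the quoted commit message,
here is a rough sketch of the arithmetic. It is not kernel code. It assumes
kmalloc size classes are plain powers of two (a simplification, but it holds
for a 1152-byte object) and a 4096-byte page, which the 8K / 2-page slab
above implies. The function names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4K pages, consistent with the 8K two-page slab


def kmalloc_bucket(size):
    """Smallest power of two >= size (simplified model of kmalloc classes)."""
    bucket = 1
    while bucket < size:
        bucket *= 2
    return bucket


def kmalloc_waste(size):
    """Bytes lost per object when it is rounded up to its kmalloc bucket."""
    return kmalloc_bucket(size) - size


def slab_waste_per_object(size, objs_per_slab, pages_per_slab,
                          page_size=PAGE_SIZE):
    """Unused tail of a dedicated slab, averaged over the objects in it."""
    slab_bytes = pages_per_slab * page_size
    return (slab_bytes - size * objs_per_slab) / objs_per_slab


if __name__ == "__main__":
    obj_size = 1152            # sizeof(struct perf_event) in the quoted setup
    instances = 256 * 8 * 20   # cpus x events x cgroups

    per_obj = kmalloc_waste(obj_size)
    print("kmalloc bucket:", kmalloc_bucket(obj_size))
    print("kmalloc waste per object:", per_obj)
    print("kmalloc waste total (MiB):", instances * per_obj / 2**20)
    print("kmem_cache waste per object: %.1f"
          % slab_waste_per_object(obj_size, 7, 2))
```

Under those assumptions the waste drops from 896 bytes to roughly 18 bytes
per instance, which matches the figures quoted above.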