[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240215182729.659f3f1c@gandalf.local.home>
Date: Thu, 15 Feb 2024 18:27:29 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Kent Overstreet <kent.overstreet@...ux.dev>
Cc: Vlastimil Babka <vbabka@...e.cz>, Suren Baghdasaryan
<surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
akpm@...ux-foundation.org, hannes@...xchg.org, roman.gushchin@...ux.dev,
mgorman@...e.de, dave@...olabs.net, willy@...radead.org,
liam.howlett@...cle.com, corbet@....net, void@...ifault.com,
peterz@...radead.org, juri.lelli@...hat.com, catalin.marinas@....com,
will@...nel.org, arnd@...db.de, tglx@...utronix.de, mingo@...hat.com,
dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com,
david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org,
nathan@...nel.org, dennis@...nel.org, tj@...nel.org, muchun.song@...ux.dev,
rppt@...nel.org, paulmck@...nel.org, pasha.tatashin@...een.com,
yosryahmed@...gle.com, yuzhao@...gle.com, dhowells@...hat.com,
hughd@...gle.com, andreyknvl@...il.com, keescook@...omium.org,
ndesaulniers@...gle.com, vvvvvv@...gle.com, gregkh@...uxfoundation.org,
ebiggers@...gle.com, ytcoode@...il.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, bsegall@...gle.com, bristot@...hat.com,
vschneid@...hat.com, cl@...ux.com, penberg@...nel.org,
iamjoonsoo.kim@....com, 42.hyeyoo@...il.com, glider@...gle.com,
elver@...gle.com, dvyukov@...gle.com, shakeelb@...gle.com,
songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com,
minchan@...gle.com, kaleshsingh@...gle.com, kernel-team@...roid.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
iommu@...ts.linux.dev, linux-arch@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-modules@...r.kernel.org, kasan-dev@...glegroups.com,
cgroups@...r.kernel.org
Subject: Re: [PATCH v3 31/35] lib: add memory allocations report in
show_mem()
On Thu, 15 Feb 2024 18:16:48 -0500
Steven Rostedt <rostedt@...dmis.org> wrote:
> On Thu, 15 Feb 2024 18:07:42 -0500
> Steven Rostedt <rostedt@...dmis.org> wrote:
>
> > text data bss dec hex filename
> > 29161847 18352730 5619716 53134293 32ac3d5 vmlinux.orig
> > 29162286 18382638 5595140 53140064 32ada60 vmlinux.memtag-off (+5771)
> > 29230868 18887662 5275652 53394182 32ebb06 vmlinux.memtag (+259889)
> > 29230746 18887662 5275652 53394060 32eba8c vmlinux.memtag-default-on (+259767) dropped?
> > 29276214 18946374 5177348 53399936 32ed180 vmlinux.memtag-debug (+265643)
>
> If you plan on running this in production, and this increases the size of
> the text by 68k, have you measured the I$ pressure that this may induce?
> That is, what is the full overhead of having this enabled, as it could
> cause more instruction cache misses?
>
> I wonder if there has been measurements of it off. That is, having this
> configured in but default off still increases the text size by 68k. That
> can't be good on the instruction cache.
>
I should have read the cover letter ;-) (someone pointed me to that on IRC):
> Performance overhead:
> To evaluate performance we implemented an in-kernel test executing
> multiple get_free_page/free_page and kmalloc/kfree calls with allocation
> sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU
> affinity set to a specific CPU to minimize the noise. Below are results
> from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on
> 56 core Intel Xeon:
These are micro benchmarks, were any larger benchmarks taken? As
microbenchmarks do not always show I$ issues (because the benchmark itself
will warm up the cache). The cache issue could slow down tasks at a bigger
picture, as it can cause more cache misses.
Running other benchmarks under perf and recording the cache misses between
the different configs would be a good picture to show.
>
> kmalloc pgalloc
> (1 baseline) 6.764s 16.902s
> (2 default disabled) 6.793s (+0.43%) 17.007s (+0.62%)
> (3 default enabled) 7.197s (+6.40%) 23.666s (+40.02%)
> (4 runtime enabled) 7.405s (+9.48%) 23.901s (+41.41%)
> (5 memcg) 13.388s (+97.94%) 48.460s (+186.71%)
>
> Memory overhead:
> Kernel size:
>
> text data bss dec diff
> (1) 26515311 18890222 17018880 62424413
> (2) 26524728 19423818 16740352 62688898 264485
> (3) 26524724 19423818 16740352 62688894 264481
> (4) 26524728 19423818 16740352 62688898 264485
> (5) 26541782 18964374 16957440 62463596 39183
Similar to my builds.
>
> Memory consumption on a 56 core Intel CPU with 125GB of memory:
> Code tags: 192 kB
> PageExts: 262144 kB (256MB)
> SlabExts: 9876 kB (9.6MB)
> PcpuExts: 512 kB (0.5MB)
>
> Total overhead is 0.2% of total memory.
All this, and we are still worried about 4k for useful debugging :-/
-- Steve
Powered by blists - more mailing lists