Message-ID: <uhagqnpumyyqsnf4qj3fxm62i6la47yknuj4ngp6vfi7hqcwsy@lm46eypwe2lp>
Date: Thu, 15 Feb 2024 19:32:38 -0500
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Vlastimil Babka <vbabka@...e.cz>, 
	Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>, akpm@...ux-foundation.org, 
	hannes@...xchg.org, roman.gushchin@...ux.dev, mgorman@...e.de, dave@...olabs.net, 
	willy@...radead.org, liam.howlett@...cle.com, corbet@....net, void@...ifault.com, 
	peterz@...radead.org, juri.lelli@...hat.com, catalin.marinas@....com, will@...nel.org, 
	arnd@...db.de, tglx@...utronix.de, mingo@...hat.com, 
	dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com, david@...hat.com, 
	axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org, nathan@...nel.org, 
	dennis@...nel.org, tj@...nel.org, muchun.song@...ux.dev, rppt@...nel.org, 
	paulmck@...nel.org, pasha.tatashin@...een.com, yosryahmed@...gle.com, 
	yuzhao@...gle.com, dhowells@...hat.com, hughd@...gle.com, andreyknvl@...il.com, 
	keescook@...omium.org, ndesaulniers@...gle.com, vvvvvv@...gle.com, 
	gregkh@...uxfoundation.org, ebiggers@...gle.com, ytcoode@...il.com, 
	vincent.guittot@...aro.org, dietmar.eggemann@....com, bsegall@...gle.com, bristot@...hat.com, 
	vschneid@...hat.com, cl@...ux.com, penberg@...nel.org, iamjoonsoo.kim@....com, 
	42.hyeyoo@...il.com, glider@...gle.com, elver@...gle.com, dvyukov@...gle.com, 
	shakeelb@...gle.com, songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com, 
	minchan@...gle.com, kaleshsingh@...gle.com, kernel-team@...roid.com, 
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, iommu@...ts.linux.dev, 
	linux-arch@...r.kernel.org, linux-fsdevel@...r.kernel.org, linux-mm@...ck.org, 
	linux-modules@...r.kernel.org, kasan-dev@...glegroups.com, cgroups@...r.kernel.org
Subject: Re: [PATCH v3 31/35] lib: add memory allocations report in show_mem()

On Thu, Feb 15, 2024 at 07:21:41PM -0500, Steven Rostedt wrote:
> On Thu, 15 Feb 2024 18:51:41 -0500
> Kent Overstreet <kent.overstreet@...ux.dev> wrote:
> 
> > Most of that is data (505024), not text (68582, or 66k).
> > 
> 
> And the 4K extra would have been data too.

"It's not that much" isn't an argument for being wasteful.

> > The data is mostly the alloc tags themselves (one per allocation
> > callsite, and you compiled the entire kernel), so that's expected.
> > 
> > Of the text, a lot of that is going to be slowpath stuff - module load
> > and unload hooks, formatting and printing the output, other assorted bits.
> > 
> > Then there's allocating and deallocating obj extension vectors - not
> > slowpath but not super fast path, not every allocation.
> > 
> > The fastpath instruction count overhead is pretty small:
> >  - actually doing the accounting - the core of slub.c, page_alloc.c,
> >    percpu.c
> >  - setting/restoring the alloc tag: this is overhead we add to every
> >    allocation callsite, so it's the most relevant - but it's just a few
> >    instructions.
> > 
> > So that's the breakdown. Definitely not zero overhead, but that fixed
> > memory overhead (and additionally, the percpu counters) is the price we
> > pay for very low runtime CPU overhead.
> 
> But where are the benchmarks that are not micro-benchmarks? How much
> overhead does this cause to those? Is it in the noise, or is it noticeable?

Microbenchmarks are how we magnify the effect of a change like this to
the most we'll ever see. Barring cache effects, it'll be in the noise.

Cache effects are a concern here because we're now touching task_struct
in the allocation fast path; that is where the
"compiled-in-but-turned-off" overhead comes from, because we can't add
static keys for that code without doubling its icache footprint, and I
don't think that would be a great tradeoff.

So: if your code has fastpath allocations where the hot part of
task_struct isn't in cache, then this will be noticeable overhead to
you, otherwise it won't be.
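
To make the task_struct point concrete, here is a rough userspace sketch of
the save/set/restore pattern at an allocation callsite. It is a model under
stated assumptions, not the code from the patch set: struct task,
current_task, accounted_malloc() and tagged_malloc() are hypothetical
stand-ins, and plain counters replace the percpu ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One tag per allocation callsite; this is the fixed data overhead. */
struct alloc_tag {
	const char *file;
	int line;
	size_t bytes;	/* simplification: the real counters are percpu */
	size_t calls;
};

/* Stand-in for task_struct, reduced to the one field the fast path touches. */
struct task {
	struct alloc_tag *alloc_tag;
};

static struct task current_task;	/* hypothetical stand-in for "current" */

/* Accounting done inside the allocator: charge whichever tag is active. */
static void *accounted_malloc(size_t size)
{
	void *p = malloc(size);
	struct alloc_tag *tag = current_task.alloc_tag;	/* load from task */

	if (p && tag) {
		tag->bytes += size;
		tag->calls++;
	}
	return p;
}

/* Per-callsite wrapper: save the old tag, set ours, allocate, restore. */
static void *tagged_malloc(struct alloc_tag *tag, size_t size)
{
	struct alloc_tag *old;
	void *p;

	assert(tag != NULL);
	old = current_task.alloc_tag;	/* load from task */
	current_task.alloc_tag = tag;	/* store to task */
	p = accounted_malloc(size);
	current_task.alloc_tag = old;	/* store to task */
	return p;
}
```

A callsite in this model would define its own static struct alloc_tag and
call tagged_malloc(&tag, n). The loads and stores on current_task happen on
every such call whether or not anyone reads the counters, so the cost is
only a few instructions when that cache line is hot and a cache miss when
it is not.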
