Message-ID: <52tsrapmkfywv4kkdpravtfmxkhxchyua4wttpugihld4iws3r@atfgtbd5wwhx>
Date: Thu, 8 May 2025 00:07:40 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: David Wang <00107082@....com>
Cc: Suren Baghdasaryan <surenb@...gle.com>, akpm@...ux-foundation.org, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] alloc_tag: avoid mem alloc and iter reset when reading
 allocinfo

On Thu, May 08, 2025 at 11:35:11AM +0800, David Wang wrote:
> 
> 
> At 2025-05-08 11:31:12, "Kent Overstreet" <kent.overstreet@...ux.dev> wrote:
> >On Thu, May 08, 2025 at 11:06:35AM +0800, David Wang wrote:
> >> Thanks for the feedback~
> >> I agree that memory allocation normally does not take a major part of a profiling report;
> >> even profiling a fio test, kmem_cache_alloc only takes ~1% of perf samples.
> >> 
> >> I don't know why I have this "the less memory allocation, the better" mindset. Maybe
> >> I was worrying about memory fragmentation, or something else I learned from some textbook.
> >> To be honest, I have never had real experience with those worries....
> >
> >It's a common bias. "Memory allocations" take up a lot of conceptual
> >space in our heads, and generally for good reason - i.e. handling memory
> >allocation errors is often a major concern, and you do always want to be
> >aware of memory layout.
> >
> >But this can turn into an aversion that's entirely disproportionate -
> >e.g. using linked lists and fixed size arrays in ways that are
> >entirely inappropriate, instead of vectors and other better data
> >structures; good data structures always require allocations.
> >
> >Profile, profile, profile, and remember your basic CS (big O notation) -
> >90% of the time, simple code with good big O running time is all you
> >need.
> 
> copy that~!

Another thing to note is that memory layout - avoiding pointer chasing -
is hugely important, but it'll almost never show up as allocator calls.

To give you some examples, mempools and biosets used to be separately
allocated. This was mainly to make error paths in outer object
constructors/destructors easier and safer: instead of keeping track of
what's initialized and what's not, if you've got a pointer to a
mempool/bioset you call *_free() on it.

(People hadn't yet clued in that you can just kzalloc() the entire outer
object; then, if the inner object is zeroed, it wasn't initialized.)

But that means you're adding a pointer chase to every mempool_alloc()
call, and since a bioset itself contains mempools, allocating bios had
_two_ unnecessary pointer derefs. That's death for performance when
you're running cache cold, but since everyone benchmarks cache-hot...

(I was the one who fixed that).

Another big one was generic_file_buffered_read(). Main buffered read
path, everyone wants it to be as fast as possible.

But the core is (was) a loop that walks the pagecache radix tree to get
the page, then copies 4k of data out to userspace (there goes L1), then
repeats all that pointer chasing for the next 4k. Pre large folios, it
was horrific.

Solution - vectorize it. Look up all the pages we're copying from all at
once, stuff them in a (dynamically allocated! for each read!) vector,
and then do the copying out to userspace all at once. Massive
performance gain.

Of course, to do that I first had to clean up a tangled 250+ line
monstrosity of half baked, poorly thought out "optimizations" (the worst
spaghetti of gotos you'd ever seen) and turn it into something
manageable...

So - keep things simple, don't overthink the little stuff, so you can
spot and tackle the big algorithmic wins :)
