linux-kernel - Re: [PATCH] mm, slab: extend kmalloc() alignment for non power-of-two sizes

Message-ID: <ZoRVbdCxrBwmDF2s@google.com>
Date: Tue, 2 Jul 2024 19:30:53 +0000
From: Roman Gushchin <roman.gushchin@...ux.dev>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: linux-mm@...ck.org, David Rientjes <rientjes@...gle.com>,
	Christoph Lameter <cl@...ux.com>,
	Hyeonggon Yoo <42.hyeyoo@...il.com>,
	Kees Cook <keescook@...omium.org>,
	Alice Ryhl <aliceryhl@...gle.com>,
	Boqun Feng <boqun.feng@...il.com>, rust-for-linux@...r.kernel.org,
	linux-kernel@...r.kernel.org, patches@...ts.linux.dev
Subject: Re: [PATCH] mm, slab: extend kmalloc() alignment for non
 power-of-two sizes

On Tue, Jul 02, 2024 at 05:58:01PM +0200, Vlastimil Babka wrote:
> Slab allocators have been guaranteeing natural alignment for
> power-of-two sizes since commit 59bb47985c1d ("mm, sl[aou]b: guarantee
> natural alignment for kmalloc(power-of-two)"), while any other sizes are
> aligned only to ARCH_KMALLOC_MINALIGN bytes.
> 
> Rust's allocator API specifies size and alignment per allocation, which
> have to satisfy the following rules, per Alice Ryhl [1]:
> 
>   1. The alignment is a power of two.
>   2. The size is non-zero.
>   3. When you round up the size to the next multiple of the alignment,
>      then it must not overflow the signed type isize / ssize_t.
> 
> In order to map this to kmalloc()'s guarantees, some requested
> allocation sizes have to be enlarged to the next power-of-two size [2].
> For example, an allocation of size 96 and alignment of 32 will be
> enlarged to an allocation of size 128, because the existing kmalloc-96
> bucket doesn't guarantee alignment above ARCH_KMALLOC_MINALIGN. Without
> slab debugging active, the layout of the kmalloc-96 slabs however
> naturally aligns the objects to 32 bytes, so extending the size to 128
> bytes is wasteful.
> 
> To improve the situation we can extend the kmalloc() alignment
> guarantees in a way that
> 
> 1) doesn't change the current slab layout (and thus does not increase
>    internal fragmentation) when slab debugging is not active
> 2) reduces waste in the Rust allocator use case
> 3) is a superset of the current guarantee for power-of-two sizes.
> 
> The extended guarantee is that alignment is at least the largest
> power-of-two divisor of the requested size. For power-of-two sizes the
> largest divisor is the size itself, but let's keep this case documented
> separately for clarity.
> 
> For current kmalloc size buckets, it means kmalloc-96 will guarantee
> alignment of 32 bytes and kmalloc-192 will guarantee 64 bytes.
> 
> This covers the rules 1 and 2 above of Rust's API as long as the size is
> a multiple of the alignment. The Rust layer should now only need to
> round up the size to the next multiple if it isn't, while enforcing the
> rule 3.
> 
> Implementation-wise, this changes the alignment calculation in
> create_boot_cache(). While at it also do the calculation only for caches
> with the SLAB_KMALLOC flag, because the function is also used to create
> the initial kmem_cache and kmem_cache_node caches, where no alignment
> guarantee is necessary.
> 
> Link: https://lore.kernel.org/all/CAH5fLggjrbdUuT-H-5vbQfMazjRDpp2%2Bk3%3DYhPyS17ezEqxwcw@mail.gmail.com/ [1]
> Link: https://lore.kernel.org/all/CAH5fLghsZRemYUwVvhk77o6y1foqnCeDzW4WZv6ScEWna2+_jw@mail.gmail.com/ [2]
> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>

Hello Vlastimil,

the idea and the implementation make total sense to me.

Do you have an estimate of the memory overhead it will typically introduce?
I don't expect it to be large, though, and it may well be offset by
performance gains from the better memory alignment. What do you think?

Thanks!
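
For reference, the extended guarantee described in the quoted patch (alignment
of at least the largest power-of-two divisor of the requested size) and the
size rounding left to the Rust layer can be sketched in plain C as below. This
is only an illustrative userspace sketch, not the kernel code from the patch:
the helper names largest_pow2_divisor() and round_up_size() are hypothetical,
and PTRDIFF_MAX is assumed here as a stand-in for Rust's isize::MAX.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: largest power of two that divides size.
 * For size > 0 this isolates the lowest set bit, e.g. 96 (0b1100000) -> 32.
 */
static size_t largest_pow2_divisor(size_t size)
{
	return size & (~size + 1);
}

/*
 * Hypothetical helper: round size up to the next multiple of align, where
 * align is a power of two (rule 1). Returns 0 if the rounded size would
 * exceed PTRDIFF_MAX, which is assumed to match isize::MAX (rule 3).
 */
static size_t round_up_size(size_t size, size_t align)
{
	size_t mask = align - 1;

	if (size > (size_t)PTRDIFF_MAX - mask)
		return 0;
	return (size + mask) & ~mask;
}
```

Under these assumptions, a request of size 96 with alignment 32 needs no
enlargement, since 32 divides 96, while a request of size 100 with alignment
32 would first be rounded up to 128.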