linux-hardening - Re: [PATCH RFC] slab: support for compiler-assisted type-based slab cache partitioning


Message-ID: <0f718809-9efc-44a3-b45e-a0297f456f7d@huawei.com>
Date: Wed, 27 Aug 2025 16:34:04 +0800
From: GONG Ruiqi <gongruiqi1@...wei.com>
To: Marco Elver <elver@...gle.com>
CC: <linux-kernel@...r.kernel.org>, <kasan-dev@...glegroups.com>, "Gustavo A.
 R. Silva" <gustavoars@...nel.org>, "Liam R. Howlett"
	<Liam.Howlett@...cle.com>, Alexander Potapenko <glider@...gle.com>, Andrew
 Morton <akpm@...ux-foundation.org>, Andrey Konovalov <andreyknvl@...il.com>,
	David Hildenbrand <david@...hat.com>, David Rientjes <rientjes@...gle.com>,
	Dmitry Vyukov <dvyukov@...gle.com>, Florent Revest <revest@...gle.com>, Harry
 Yoo <harry.yoo@...cle.com>, Jann Horn <jannh@...gle.com>, Kees Cook
	<kees@...nel.org>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Matteo Rizzo
	<matteorizzo@...gle.com>, Michal Hocko <mhocko@...e.com>, Mike Rapoport
	<rppt@...nel.org>, Nathan Chancellor <nathan@...nel.org>, Roman Gushchin
	<roman.gushchin@...ux.dev>, Suren Baghdasaryan <surenb@...gle.com>, Vlastimil
 Babka <vbabka@...e.cz>, <linux-hardening@...r.kernel.org>,
	<linux-mm@...ck.org>
Subject: Re: [PATCH RFC] slab: support for compiler-assisted type-based slab
 cache partitioning



On 8/26/2025 7:01 PM, Marco Elver wrote:
> On Tue, 26 Aug 2025 at 06:59, GONG Ruiqi <gongruiqi1@...wei.com> wrote:
>> On 8/25/2025 11:44 PM, Marco Elver wrote:
>>> ...
>>>
>>> Introduce a new mode, TYPED_KMALLOC_CACHES, which leverages Clang's
>>> "allocation tokens" via __builtin_alloc_token_infer [1].
>>>
>>> This mechanism allows the compiler to pass a token ID derived from the
>>> allocation's type to the allocator. The compiler performs best-effort
>>> type inference, and recognizes idioms such as kmalloc(sizeof(T), ...).
>>> Unlike RANDOM_KMALLOC_CACHES, this mode deterministically assigns a slab
>>> cache to an allocation of type T, regardless of allocation site.
>>>
>>> Clang's default token ID calculation is described as [1]:
>>>
>>>    TypeHashPointerSplit: This mode assigns a token ID based on the hash
>>>    of the allocated type's name, where the top half ID-space is reserved
>>>    for types that contain pointers and the bottom half for types that do
>>>    not contain pointers.
>>
>> Is a type's token id always the same across different builds? Or somehow
>> predictable? If so, the attacker could probably find out all types that
>> end up with the same id, and use some of them to exploit the buggy one.
> 
> Yes, it's meant to be deterministic and predictable. I guess this is
> the same question regarding randomness, for which it's unclear if it
> strengthens or weakens the mitigation. As I wrote elsewhere:
> 
>> Irrespective of the top/bottom split, one of the key properties to
>> retain is that allocations of type T are predictably assigned a slab
>> cache. This means that even if a pointer-containing object of type T
>> is vulnerable, yet the pointer within T is useless for exploitation,
>> the difficulty of getting to a sensitive object S is still increased
>> by the fact that S is unlikely to be co-located. If we were to
>> introduce more randomness, we increase the probability that S will be
>> co-located with T, which is counter-intuitive to me.

I'm interested in this topic. Let's go through the possible cases here.

If S doesn't contain a pointer member, then your isolation of
pointer-containing objects completely separates S from T. No problem
there, and it has nothing to do with randomness.

If S does, then whether they co-locate depends entirely on the token
algorithm, which has two problems:

1. The result is deterministic and therefore known to everyone,
   including the attacker, who can analyze the code to find an S
   suitable for exploitation.
2. Once such a T & S pair exists, we can't intervene in the algorithm,
   and the defense fails for all builds (of the same or nearby kernel
   versions at least).
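
To make the two cases above concrete, here is a minimal sketch in C of a
deterministic, name-hash-based token assignment with a pointer split. It
is only an illustration: NUM_TOKENS, fnv1a(), token_id() and co_located()
are hypothetical names, and FNV-1a stands in for whatever hash Clang
actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical number of typed slab caches (token IDs). */
#define NUM_TOKENS 16u

/* FNV-1a over the type name; a stand-in, not Clang's real hash. */
static uint32_t fnv1a(const char *s)
{
	uint32_t h = 2166136261u;

	while (*s) {
		h ^= (uint8_t)*s++;
		h *= 16777619u;
	}
	return h;
}

/*
 * Pointer-containing types land in the top half of the ID space,
 * all other types in the bottom half. No seed is involved, so the
 * result depends only on the type name and the pointer flag.
 */
static unsigned int token_id(const char *type_name, bool has_pointer)
{
	unsigned int half = NUM_TOKENS / 2;
	unsigned int id;

	assert(type_name);
	id = fnv1a(type_name) % half;
	return has_pointer ? half + id : id;
}

/* What an attacker can compute offline for any pair of types. */
static bool co_located(const char *t, bool t_ptr, const char *s, bool s_ptr)
{
	return token_id(t, t_ptr) == token_id(s, s_ptr);
}
```

With this mapping, co_located("struct T", true, "struct S", false) is
always false, whatever the hash does. For two pointer-containing types
the answer depends only on their names, so anyone can compute it from
the source, and it is the same for every build.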

Here I think randomness could help: its value is not just about
separating things probabilistically, but more about blinding the
attacker. In this scenario, randomness could leave the attacker unable
to find a suitable S, so they couldn't exploit the bug even though such
an S & T pair exists. As you mentioned (somewhere else), the attacker
might still be able to "take off the eye mask" and locate S & T by some
other means, e.g. by analyzing resource information at runtime, but
randomness is not to blame for that. We could do something else about it
(e.g. expose less information about the random-candidate slab caches),
and that's another story.
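
A rough sketch of what I mean, assuming a hypothetical per-build (or
per-boot) seed mixed into the type-name hash. None of these names exist
in the patch or in Clang, and FNV-1a is only a stand-in hash. The
pointer split is kept; only the placement within each half becomes
seed-dependent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical number of typed slab caches (token IDs). */
#define NUM_TOKENS 16u

/* FNV-1a with the seed folded into the initial state (an assumption). */
static uint32_t fnv1a_seeded(const char *s, uint32_t seed)
{
	uint32_t h = 2166136261u ^ seed;

	while (*s) {
		h ^= (uint8_t)*s++;
		h *= 16777619u;
	}
	return h;
}

/*
 * Same top/bottom split as the deterministic scheme, but which slot a
 * type gets inside its half now depends on a seed the attacker does
 * not know from the source code alone.
 */
static unsigned int token_id_seeded(const char *type_name, bool has_pointer,
				    uint32_t seed)
{
	unsigned int half = NUM_TOKENS / 2;
	unsigned int id;

	assert(type_name);
	id = fnv1a_seeded(type_name, seed) % half;
	return has_pointer ? half + id : id;
}
```

For a fixed seed the assignment is still stable, so all allocations of
type T keep going to one cache within a given build. What changes is
that a colliding S & T pair found by reading the code is no longer
guaranteed to collide on a particular target.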

> 
> I think we can reason either way, and I grant you this is rather ambiguous.
> 
> But the definitive point that was made to me from various security
> researchers that inspired this technique is that the most useful thing
> we can do is separate pointer-containing objects from
> non-pointer-containing objects (in absence of slab per type, which is
> likely too costly in the common case).

Isolating pointer-containing objects is indeed the key point. To me it
is orthogonal to randomness, and the two can be combined to build
stronger hardening.