linux-kernel - Re: [PATCH v4 14/36] lib: add allocation tagging support for memory allocation profiling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <202402211608.41AD94094@keescook>
Date: Wed, 21 Feb 2024 16:25:02 -0800
From: Kees Cook <keescook@...omium.org>
To: Kent Overstreet <kent.overstreet@...ux.dev>
Cc: Suren Baghdasaryan <surenb@...gle.com>, akpm@...ux-foundation.org,
	mhocko@...e.com, vbabka@...e.cz, hannes@...xchg.org,
	roman.gushchin@...ux.dev, mgorman@...e.de, dave@...olabs.net,
	willy@...radead.org, liam.howlett@...cle.com,
	penguin-kernel@...ove.sakura.ne.jp, corbet@....net,
	void@...ifault.com, peterz@...radead.org, juri.lelli@...hat.com,
	catalin.marinas@....com, will@...nel.org, arnd@...db.de,
	tglx@...utronix.de, mingo@...hat.com, dave.hansen@...ux.intel.com,
	x86@...nel.org, peterx@...hat.com, david@...hat.com,
	axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org,
	nathan@...nel.org, dennis@...nel.org, tj@...nel.org,
	muchun.song@...ux.dev, rppt@...nel.org, paulmck@...nel.org,
	pasha.tatashin@...een.com, yosryahmed@...gle.com, yuzhao@...gle.com,
	dhowells@...hat.com, hughd@...gle.com, andreyknvl@...il.com,
	ndesaulniers@...gle.com, vvvvvv@...gle.com,
	gregkh@...uxfoundation.org, ebiggers@...gle.com, ytcoode@...il.com,
	vincent.guittot@...aro.org, dietmar.eggemann@....com,
	rostedt@...dmis.org, bsegall@...gle.com, bristot@...hat.com,
	vschneid@...hat.com, cl@...ux.com, penberg@...nel.org,
	iamjoonsoo.kim@....com, 42.hyeyoo@...il.com, glider@...gle.com,
	elver@...gle.com, dvyukov@...gle.com, shakeelb@...gle.com,
	songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com,
	minchan@...gle.com, kaleshsingh@...gle.com, kernel-team@...roid.com,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
	iommu@...ts.linux.dev, linux-arch@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
	linux-modules@...r.kernel.org, kasan-dev@...glegroups.com,
	cgroups@...r.kernel.org
Subject: Re: [PATCH v4 14/36] lib: add allocation tagging support for memory
 allocation profiling

On Wed, Feb 21, 2024 at 06:29:17PM -0500, Kent Overstreet wrote:
> On Wed, Feb 21, 2024 at 03:05:32PM -0800, Kees Cook wrote:
> > On Wed, Feb 21, 2024 at 11:40:27AM -0800, Suren Baghdasaryan wrote:
> > > [...]
> > > +struct alloc_tag {
> > > +	struct codetag			ct;
> > > +	struct alloc_tag_counters __percpu	*counters;
> > > +} __aligned(8);
> > > [...]
> > > +#define DEFINE_ALLOC_TAG(_alloc_tag)						\
> > > +	static DEFINE_PER_CPU(struct alloc_tag_counters, _alloc_tag_cntr);	\
> > > +	static struct alloc_tag _alloc_tag __used __aligned(8)			\
> > > +	__section("alloc_tags") = {						\
> > > +		.ct = CODE_TAG_INIT,						\
> > > +		.counters = &_alloc_tag_cntr };
> > > [...]
> > > +static inline struct alloc_tag *alloc_tag_save(struct alloc_tag *tag)
> > > +{
> > > +	swap(current->alloc_tag, tag);
> > > +	return tag;
> > > +}
> > 
> > Future security hardening improvement idea based on this infrastructure:
> > it should be possible to implement per-allocation-site kmem caches. For
> > example, we could create:
> > 
> > struct alloc_details {
> > 	u32 flags;
> > 	union {
> > 		u32 size; /* not valid after __init completes */
> > 		struct kmem_cache *cache;
> > 	};
> > };
> > 
> > - add struct alloc_details to struct alloc_tag
> > - move the tags section into .ro_after_init
> > - extend alloc_hooks() to populate flags and size:
> > 	.flags = __builtin_constant_p(size) ? KMALLOC_ALLOCATE_FIXED
> > 					    : KMALLOC_ALLOCATE_BUCKETS;
> > 	.size = __builtin_constant_p(size) ? size : SIZE_MAX;
> > - during kernel start or module init, walk the alloc_tag list
> >   and create either a fixed-size kmem_cache or to allocate a
> >   full set of kmalloc-buckets, and update the "cache" member.
> > - adjust kmalloc core routines to use current->alloc_tag->cache instead
> >   of using the global buckets.
> > 
> > This would get us fully separated allocations, producing better than
> > type-based levels of granularity, exceeding what we have currently with
> > CONFIG_RANDOM_KMALLOC_CACHES.
> > 
> > Does this look possible, or am I misunderstanding something in the
> > infrastructure being created here?
> 
> Definitely possible, but... would we want this?

Yes, very very much. One of the worst and mostly unaddressed weaknesses
with the kernel right now is use-after-free based type confusion[0], which
depends on merged caches (or cache reuse).

This doesn't solve cross-allocator (kmalloc/page_alloc) type confusion
(as terrifyingly demonstrated[1] by Jann Horn), but it does help with
what has been a very common case of "use msg_msg to impersonate your
target object"[2] exploitation.

> That would produce a _lot_ of kmem caches

Fewer than you'd expect, but yes, there is some overhead. However,
out-of-tree forks of Linux have successfully experimented with this
already and seen good results[3].

> and don't we already try to collapse those where possible to reduce
> internal fragmentation?

In the past, yes, but the desire for security has tended to have more
people building with SLAB_MERGE_DEFAULT=n and/or CONFIG_RANDOM_KMALLOC_CACHES=y
(or booting with "slab_nomerge").

Just doing the type safety isn't sufficient without the cross-allocator
safety, but we've also had solutions for that proposed[4].

-Kees

[0] https://github.com/KSPP/linux/issues/189
[1] https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html
[2] https://www.willsroot.io/2021/08/corctf-2021-fire-of-salvation-writeup.html
    https://google.github.io/security-research/pocs/linux/cve-2021-22555/writeup.html#exploring-struct-msg_msg
[3] https://grsecurity.net/how_autoslab_changes_the_memory_unsafety_game
[4] https://lore.kernel.org/linux-hardening/20230915105933.495735-1-matteorizzo@google.com/

-- 
Kees Cook