[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <vxx2o2wdcqjkxauglu7ul52mygu4tti2i3yc2dvmcbzydvgvu2@knujflwtakni>
Date: Wed, 21 Feb 2024 19:34:44 -0500
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Kees Cook <keescook@...omium.org>
Cc: Suren Baghdasaryan <surenb@...gle.com>, akpm@...ux-foundation.org,
mhocko@...e.com, vbabka@...e.cz, hannes@...xchg.org, roman.gushchin@...ux.dev,
mgorman@...e.de, dave@...olabs.net, willy@...radead.org, liam.howlett@...cle.com,
penguin-kernel@...ove.sakura.ne.jp, corbet@....net, void@...ifault.com, peterz@...radead.org,
juri.lelli@...hat.com, catalin.marinas@....com, will@...nel.org, arnd@...db.de,
tglx@...utronix.de, mingo@...hat.com, dave.hansen@...ux.intel.com, x86@...nel.org,
peterx@...hat.com, david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org,
masahiroy@...nel.org, nathan@...nel.org, dennis@...nel.org, tj@...nel.org,
muchun.song@...ux.dev, rppt@...nel.org, paulmck@...nel.org, pasha.tatashin@...een.com,
yosryahmed@...gle.com, yuzhao@...gle.com, dhowells@...hat.com, hughd@...gle.com,
andreyknvl@...il.com, ndesaulniers@...gle.com, vvvvvv@...gle.com,
gregkh@...uxfoundation.org, ebiggers@...gle.com, ytcoode@...il.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org,
bsegall@...gle.com, bristot@...hat.com, vschneid@...hat.com, cl@...ux.com,
penberg@...nel.org, iamjoonsoo.kim@....com, 42.hyeyoo@...il.com, glider@...gle.com,
elver@...gle.com, dvyukov@...gle.com, shakeelb@...gle.com,
songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com, minchan@...gle.com,
kaleshsingh@...gle.com, kernel-team@...roid.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev, linux-arch@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org, linux-modules@...r.kernel.org,
kasan-dev@...glegroups.com, cgroups@...r.kernel.org
Subject: Re: [PATCH v4 14/36] lib: add allocation tagging support for memory
allocation profiling
On Wed, Feb 21, 2024 at 04:25:02PM -0800, Kees Cook wrote:
> On Wed, Feb 21, 2024 at 06:29:17PM -0500, Kent Overstreet wrote:
> > On Wed, Feb 21, 2024 at 03:05:32PM -0800, Kees Cook wrote:
> > > On Wed, Feb 21, 2024 at 11:40:27AM -0800, Suren Baghdasaryan wrote:
> > > > [...]
> > > > +struct alloc_tag {
> > > > + struct codetag ct;
> > > > + struct alloc_tag_counters __percpu *counters;
> > > > +} __aligned(8);
> > > > [...]
> > > > +#define DEFINE_ALLOC_TAG(_alloc_tag) \
> > > > + static DEFINE_PER_CPU(struct alloc_tag_counters, _alloc_tag_cntr); \
> > > > + static struct alloc_tag _alloc_tag __used __aligned(8) \
> > > > + __section("alloc_tags") = { \
> > > > + .ct = CODE_TAG_INIT, \
> > > > + .counters = &_alloc_tag_cntr };
> > > > [...]
> > > > +static inline struct alloc_tag *alloc_tag_save(struct alloc_tag *tag)
> > > > +{
> > > > + swap(current->alloc_tag, tag);
> > > > + return tag;
> > > > +}
> > >
> > > Future security hardening improvement idea based on this infrastructure:
> > > it should be possible to implement per-allocation-site kmem caches. For
> > > example, we could create:
> > >
> > > struct alloc_details {
> > > u32 flags;
> > > union {
> > > u32 size; /* not valid after __init completes */
> > > struct kmem_cache *cache;
> > > };
> > > };
> > >
> > > - add struct alloc_details to struct alloc_tag
> > > - move the tags section into .ro_after_init
> > > - extend alloc_hooks() to populate flags and size:
> > > .flags = __builtin_constant_p(size) ? KMALLOC_ALLOCATE_FIXED
> > > : KMALLOC_ALLOCATE_BUCKETS;
> > > .size = __builtin_constant_p(size) ? size : SIZE_MAX;
> > > - during kernel start or module init, walk the alloc_tag list
> > > and create either a fixed-size kmem_cache or to allocate a
> > > full set of kmalloc-buckets, and update the "cache" member.
> > > - adjust kmalloc core routines to use current->alloc_tag->cache instead
> > > of using the global buckets.
> > >
> > > This would get us fully separated allocations, producing better than
> > > type-based levels of granularity, exceeding what we have currently with
> > > CONFIG_RANDOM_KMALLOC_CACHES.
> > >
> > > Does this look possible, or am I misunderstanding something in the
> > > infrastructure being created here?
> >
> > Definitely possible, but... would we want this?
>
> Yes, very very much. One of the worst and mostly unaddressed weaknesses
> with the kernel right now is use-after-free based type confusion[0], which
> depends on merged caches (or cache reuse).
>
> This doesn't solve cross-allocator (kmalloc/page_alloc) type confusion
> (as terrifyingly demonstrated[1] by Jann Horn), but it does help with
> what has been a very common case of "use msg_msg to impersonate your
> target object"[2] exploitation.
We have a ton of code that references PAGE_SIZE and uses the page
allocator completely unnecessarily - that's something worth harping
about at conferences; if we could motivate people to clean that stuff up
it'd have a lot of positive effects.
> > That would produce a _lot_ of kmem caches
>
> Fewer than you'd expect, but yes, there is some overhead. However,
> out-of-tree forks of Linux have successfully experimented with this
> already and seen good results[3].
So in that case - I don't think there's any need for a separate
alloc_details; we'd just add a kmem_cache * to alloc_tag and then hook
into the codetag init/unload path to create and destroy the kmem caches.
No need to adjust the slab code either; alloc_hooks() itself could
dispatch to kmem_cache_alloc() instead of kmalloc() if this is in use.
Powered by blists - more mailing lists