Message-ID: <cc301463-da43-4991-b001-d92521384253@suse.cz>
Date: Thu, 20 Jun 2024 15:56:27 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Kees Cook <kees@...nel.org>
Cc: "GONG, Ruiqi" <gongruiqi@...weicloud.com>,
Christoph Lameter <cl@...ux.com>, Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>, Joonsoo Kim <iamjoonsoo.kim@....com>,
jvoisin <julien.voisin@...tri.org>, Andrew Morton
<akpm@...ux-foundation.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
Hyeonggon Yoo <42.hyeyoo@...il.com>, Xiu Jianfeng <xiujianfeng@...wei.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Kent Overstreet <kent.overstreet@...ux.dev>, Jann Horn <jannh@...gle.com>,
Matteo Rizzo <matteorizzo@...gle.com>, Thomas Graf <tgraf@...g.ch>,
Herbert Xu <herbert@...dor.apana.org.au>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-hardening@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH v5 4/6] mm/slab: Introduce kmem_buckets_create() and
family
On 6/19/24 9:33 PM, Kees Cook wrote:
> Dedicated caches are available for fixed size allocations via
> kmem_cache_alloc(), but for dynamically sized allocations there is only
> the global kmalloc API's set of buckets available. This means it isn't
> possible to separate specific sets of dynamically sized allocations into
> a separate collection of caches.
>
> This leads to a use-after-free exploitation weakness in the Linux
> kernel since many heap memory spraying/grooming attacks depend on using
> userspace-controllable dynamically sized allocations to collide with
> fixed size allocations that end up in the same cache.
>
> While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense
> against these kinds of "type confusion" attacks, including for fixed
> same-size heap objects, we can create a complementary deterministic
> defense for dynamically sized allocations that are directly user
> controlled. Addressing these cases is limited in scope, so isolating these
> kinds of interfaces will not become an unbounded game of whack-a-mole. For
> example, many pass through memdup_user(), making isolation there very
> effective.
>
> In order to isolate user-controllable dynamically-sized
> allocations from the common system kmalloc allocations, introduce
> kmem_buckets_create(), which behaves like kmem_cache_create(). Introduce
> kmem_buckets_alloc(), which behaves like kmem_cache_alloc(). Introduce
> kmem_buckets_alloc_track_caller() for where caller tracking is
> needed. Introduce kmem_buckets_valloc() for cases where vmalloc fallback
> is needed.
>
> This can also be used in the future to extend allocation profiling's use
> of code tagging to implement per-caller allocation cache isolation[1]
> even for dynamic allocations.
>
> Memory allocation pinning[2] is still needed to plug the Use-After-Free
> cross-allocator weakness, but that is an existing and separate issue
> which is complementary to this improvement. Development continues for
> that feature via the SLAB_VIRTUAL[3] series (which could also provide
> guard pages -- another complementary improvement).
>
> Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1]
> Link: https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html [2]
> Link: https://lore.kernel.org/lkml/20230915105933.495735-1-matteorizzo@google.com/ [3]
> Signed-off-by: Kees Cook <kees@...nel.org>
> ---
> include/linux/slab.h | 13 ++++++++
> mm/slab_common.c | 78 ++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 91 insertions(+)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 8d0800c7579a..3698b15b6138 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -549,6 +549,11 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru,
>
> void kmem_cache_free(struct kmem_cache *s, void *objp);
>
> +kmem_buckets *kmem_buckets_create(const char *name, unsigned int align,
> + slab_flags_t flags,
> + unsigned int useroffset, unsigned int usersize,
> + void (*ctor)(void *));
I'd drop the ctor; I can't imagine how it would be used with variable-sized
allocations. "align" probably doesn't make much sense either, since we're just
copying the kmalloc cache sizes and their implicit alignment for
power-of-two allocations. I don't think any current kmalloc user would
suddenly need either of them when converted to buckets, and definitely
not any user converted automatically by the code tagging.
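
To illustrate the alignment point, here is a rough userspace model (not kernel
code). The size-class table, MODEL_MINALIGN and the helper names are
assumptions made up for this sketch; the real cache sizes and minimum
alignment depend on architecture and config. It only shows that a copied set
of kmalloc-like buckets already implies the alignment for power-of-two sizes,
so a separate "align" argument has little left to express.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative size classes, modelled on typical kmalloc cache sizes. */
static const size_t bucket_sizes[] = {
	8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192,
};
#define NR_BUCKETS (sizeof(bucket_sizes) / sizeof(bucket_sizes[0]))

/* Assumed minimum alignment for this model; arch-dependent in reality. */
#define MODEL_MINALIGN 8

/* Return the smallest size class that fits @size, or 0 if none does. */
static size_t bucket_size_for(size_t size)
{
	assert(size > 0);
	for (size_t i = 0; i < NR_BUCKETS; i++)
		if (size <= bucket_sizes[i])
			return bucket_sizes[i];
	return 0;
}

static int is_power_of_two(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Alignment implied by the requested size in this model: power-of-two
 * requests are naturally aligned to their size, everything else only
 * gets the assumed minimum.
 */
static size_t implied_align(size_t size)
{
	assert(size > 0);
	if (is_power_of_two(size) && size > MODEL_MINALIGN)
		return size;
	return MODEL_MINALIGN;
}
```

In this model a 100-byte request lands in the 128-byte class, and a 64-byte
request is 64-byte aligned without the caller asking for it.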