linux-hardening - Re: [PATCH v5] Randomized slab caches for kmalloc()

Message-ID: <e14c547e-bb3f-4ede-8f0a-dcaa548fe5af@dustri.org>
Date:   Mon, 11 Sep 2023 23:18:15 +0200
From:   jvoisin <julien.voisin@...tri.org>
To:     gongruiqi@...weicloud.com
Cc:     42.hyeyoo@...il.com, akpm@...ux-foundation.org,
        aleksander.lobakin@...el.com, cl@...ux.com, dennis@...nel.org,
        dvyukov@...gle.com, elver@...gle.com, glider@...gle.com,
        gongruiqi1@...wei.com, iamjoonsoo.kim@....com, jannh@...gle.com,
        jmorris@...ei.org, keescook@...omium.org,
        linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, paul@...l-moore.com, pedro.falcato@...il.com,
        penberg@...nel.org, rientjes@...gle.com, roman.gushchin@...ux.dev,
        serge@...lyn.com, tj@...nel.org, vbabka@...e.cz,
        wangweiyang2@...wei.com, xiujianfeng@...wei.com
Subject: Re: [PATCH v5] Randomized slab caches for kmalloc()

I wrote a small blogpost[1] about this series, and was told[2] that it
would be interesting to share it on this thread, so here it is, copied
verbatim:

Ruiqi Gong and Xiu Jianfeng got their
[Randomized slab caches for kmalloc()](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c6152940584290668b35fa0800026f6a1ae05fe)
patch series merged upstream, and I've had enough discussions about it to
warrant summarising them into a small blogpost.

The main idea is to have multiple slab caches, and pick one pseudo-randomly
based on the address of the code calling `kmalloc()` and a per-boot seed, to
make heap-spraying harder.
It's a great idea, but it comes with some shortcomings for now:

- Objects allocated via wrappers around `kmalloc()`, like `sock_kmalloc`,
  `f2fs_kmalloc`, `aligned_kmalloc`, … will end up in the same slab cache.
- The slabs need to be pinned, otherwise an attacker could
  [feng-shui](https://en.wikipedia.org/wiki/Heap_feng_shui) their way
  into having the whole slab freed, garbage-collected, and have a slab for
  another type allocated at the same VA. [Jann Horn](https://thejh.net/)
  and [Matteo Rizzo](https://infosec.exchange/@nspace) have a
  [nice set of patches](https://github.com/torvalds/linux/compare/master...thejh:linux:slub-virtual-upstream),
  discussed a bit in
  [this Project Zero blogpost](https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html),
  for a feature called
  [`SLAB_VIRTUAL`](https://github.com/torvalds/linux/commit/f3afd3a2152353be355b90f5fd4367adbf6a955e),
  implementing precisely this.
- There are 16 slab caches by default, so there is one chance out of 16 to
  end up in the same slab cache as the target.
- There are no guard pages between caches, so inter-cache overflows are
  possible.
- As pointed out by
  [andreyknvl](https://twitter.com/andreyknvl/status/1700267669336080678)
  and [minipli](https://infosec.exchange/@minipli/111045336853055793),
  fewer allocations hitting a given cache means less noise,
  so it might even help with some heap feng-shui.
- minipli also pointed out that "randomized caches still freely
  mix kernel allocations with user controlled ones (`xattr`, `keyctl`,
  `msg_msg`, …).
  So even though merging is disabled for these caches, i.e. no direct overlap
  with `cred_jar` etc., other object types can still be targeted (`struct
  pipe_buffer`, BPF maps, its verifier state objects,…). It’s just a matter of
  probing which allocation index the targeted object falls into.",
  but I considered this out of scope, since it's much more involved,
  although something like
  [`CONFIG_KMALLOC_SPLIT_VARSIZE`](https://github.com/thejh/linux/blob/slub-virtual/MITIGATION_README)
  wouldn't significantly increase complexity.

Also, while code addresses as a source of entropy have historically been a
great way to provide [KASLR](https://lwn.net/Articles/569635/) bypasses,
`hash_64(caller ^ random_kmalloc_seed, ilog2(RANDOM_KMALLOC_CACHES_NR + 1))`
shouldn't trivially leak offsets.
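
To make the cache selection concrete, here is a minimal userspace sketch of
the expression above. It is not the kernel code: `hash_64()` is
re-implemented as the generic multiplicative hash that I believe the kernel
uses on 64-bit, `ilog2_u64()` and `kmalloc_cache_index()` are hypothetical
helper names, and the constants (`GOLDEN_RATIO_64`, a
`RANDOM_KMALLOC_CACHES_NR` of 15) are assumptions taken from my reading of
the upstream tree.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the upstream kernel; treat both as illustrative. */
#define GOLDEN_RATIO_64 0x61C8864680B583EBull
#define RANDOM_KMALLOC_CACHES_NR 15

/* Userspace stand-in for hash_64(): multiply, then keep the top `bits` bits. */
static uint64_t hash_64(uint64_t val, unsigned int bits)
{
    return (val * GOLDEN_RATIO_64) >> (64 - bits);
}

/* Userspace stand-in for ilog2(): floor(log2(v)) for a non-zero v. */
static unsigned int ilog2_u64(uint64_t v)
{
    unsigned int r = 0;

    while (v >>= 1)
        r++;
    return r;
}

/*
 * Hypothetical helper: map a call-site address and a per-boot seed to one
 * of the RANDOM_KMALLOC_CACHES_NR + 1 cache copies.
 */
static unsigned int kmalloc_cache_index(uint64_t caller, uint64_t seed)
{
    unsigned int bits = ilog2_u64(RANDOM_KMALLOC_CACHES_NR + 1);
    unsigned int idx = (unsigned int)hash_64(caller ^ seed, bits);

    /* Only the top 4 bits survive, so the index is always in [0, 15]. */
    assert(idx <= RANDOM_KMALLOC_CACHES_NR);
    return idx;
}
```

Only four bits of the product are kept, which is why observing the chosen
cache shouldn't trivially reveal the caller address or the seed. The mapping
is also deterministic for a given boot: identical `(caller, seed)` inputs
always give the same index, which is why allocations funnelled through a
single call site share a cache.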

The segregation technique is a bit like a weaker version of grsecurity's
[AUTOSLAB](https://grsecurity.net/how_autoslab_changes_the_memory_unsafety_game),
or a weaker kernel-land version of
[PartitionAlloc](https://chromium.googlesource.com/chromium/src/+/master/base/allocator/partition_allocator/PartitionAlloc.md),
but to be fair, making use-after-free exploitation harder, and significantly
harder once pinning lands, with only ~150 lines of code and negligible
performance impact is amazing and should be praised. Moreover, I wouldn't be
surprised if this was backported to [Google's
KernelCTF](https://google.github.io/security-research/kernelctf/rules.html)
soon, so we should see whether my analysis is correct.

1.
https://dustri.org/b/some-notes-on-randomized-slab-caches-for-kmalloc.html
2. https://infosec.exchange/@vbabka@social.kernel.org/111046740392510260

-- 
Julien (jvoisin) Voisin
GPG: 04D041E8171901CC
dustri.org