Message-ID: <1b07c7ec-b95e-7db2-6404-eb8210162fbc@suse.cz>
Date: Tue, 24 Nov 2020 19:27:19 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Topi Miettinen <toiwoton@...il.com>,
linux-hardening@...r.kernel.org, akpm@...ux-foundation.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc: Jann Horn <jannh@...gle.com>, Kees Cook <keescook@...omium.org>,
Matthew Wilcox <willy@...radead.org>,
Mike Rapoport <rppt@...nel.org>,
Linux API <linux-api@...r.kernel.org>
Subject: Re: [PATCH v4] mm: Optional full ASLR for mmap() and mremap()

Please CC linux-api on future versions.

On 10/26/20 5:05 PM, Topi Miettinen wrote:
> Writing a new value of 3 to /proc/sys/kernel/randomize_va_space
> enables full randomization of memory mappings created with mmap(NULL,
> ...). With 2, the base of the VMA used for such mappings is random,
> but the mappings are created in predictable places within the VMA and
> in sequential order. With 3, new VMAs are created to fully randomize
> the mappings. Also mremap(..., MREMAP_MAYMOVE) will move the mappings
> even if not necessary.
>
> The method is to randomize the new address without considering
> VMAs. If the address fails checks because of overlap with the stack
> area (or in case of mremap(), overlap with the old mapping), the
> operation is retried a few times before falling back to old method.
>
> On 32 bit systems this may cause problems due to increased VM
> fragmentation if the address space gets crowded.
>
> On all systems, it will reduce performance and increase memory
> usage due to less efficient use of page tables and inability to
> merge adjacent VMAs with compatible attributes.
>
> In this example with value of 2, dynamic loader, libc, anonymous
> memory reserved with mmap() and locale-archive are located close to
> each other:
>
> $ cat /proc/self/maps (only first line for each object shown for brevity)
> 58c1175b1000-58c1175b3000 r--p 00000000 fe:0c 1868624 /usr/bin/cat
> 79752ec17000-79752f179000 r--p 00000000 fe:0c 2473999 /usr/lib/locale/locale-archive
> 79752f179000-79752f279000 rw-p 00000000 00:00 0
> 79752f279000-79752f29e000 r--p 00000000 fe:0c 2402415 /usr/lib/x86_64-linux-gnu/libc-2.31.so
> 79752f43a000-79752f440000 rw-p 00000000 00:00 0
> 79752f46f000-79752f470000 r--p 00000000 fe:0c 2400484 /usr/lib/x86_64-linux-gnu/ld-2.31.so
> 79752f49b000-79752f49c000 rw-p 00000000 00:00 0
> 7ffdcad9e000-7ffdcadbf000 rw-p 00000000 00:00 0 [stack]
> 7ffdcadd2000-7ffdcadd6000 r--p 00000000 00:00 0 [vvar]
> 7ffdcadd6000-7ffdcadd8000 r-xp 00000000 00:00 0 [vdso]
>
> With 3, they are located at unrelated addresses:
> $ echo 3 > /proc/sys/kernel/randomize_va_space
> $ cat /proc/self/maps (only first line for each object shown for brevity)
> 1206a8fa000-1206a8fb000 r--p 00000000 fe:0c 2400484 /usr/lib/x86_64-linux-gnu/ld-2.31.so
> 1206a926000-1206a927000 rw-p 00000000 00:00 0
> 19174173000-19174175000 rw-p 00000000 00:00 0
> ac82f419000-ac82f519000 rw-p 00000000 00:00 0
> afa66a42000-afa66fa4000 r--p 00000000 fe:0c 2473999 /usr/lib/locale/locale-archive
> d8656ba9000-d8656bce000 r--p 00000000 fe:0c 2402415 /usr/lib/x86_64-linux-gnu/libc-2.31.so
> d8656d6a000-d8656d6e000 rw-p 00000000 00:00 0
> 5df90b712000-5df90b714000 r--p 00000000 fe:0c 1868624 /usr/bin/cat
> 7ffe1be4c000-7ffe1be6d000 rw-p 00000000 00:00 0 [stack]
> 7ffe1bf07000-7ffe1bf0b000 r--p 00000000 00:00 0 [vvar]
> 7ffe1bf0b000-7ffe1bf0d000 r-xp 00000000 00:00 0 [vdso]
>
> CC: Andrew Morton <akpm@...ux-foundation.org>
> CC: Jann Horn <jannh@...gle.com>
> CC: Kees Cook <keescook@...omium.org>
> CC: Matthew Wilcox <willy@...radead.org>
> CC: Mike Rapoport <rppt@...nel.org>
> Signed-off-by: Topi Miettinen <toiwoton@...il.com>
> ---
> v2: also randomize mremap(..., MREMAP_MAYMOVE)
> v3: avoid stack area and retry in case of bad random address (Jann
> Horn), improve description in kernel.rst (Matthew Wilcox)
> v4: use /proc/$pid/maps in the example (Mike Rapoport), CCs (Andrew
> Morton), only check randomize_va_space == 3
> ---
> Documentation/admin-guide/hw-vuln/spectre.rst | 6 ++--
> Documentation/admin-guide/sysctl/kernel.rst | 15 ++++++++++
> init/Kconfig | 2 +-
> mm/internal.h | 8 +++++
> mm/mmap.c | 30 +++++++++++++------
> mm/mremap.c | 27 +++++++++++++++++
> 6 files changed, 75 insertions(+), 13 deletions(-)
>
> diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
> index e05e581af5cf..9ea250522077 100644
> --- a/Documentation/admin-guide/hw-vuln/spectre.rst
> +++ b/Documentation/admin-guide/hw-vuln/spectre.rst
> @@ -254,7 +254,7 @@ Spectre variant 2
> left by the previous process will also be cleared.
>
> User programs should use address space randomization to make attacks
> - more difficult (Set /proc/sys/kernel/randomize_va_space = 1 or 2).
> + more difficult (Set /proc/sys/kernel/randomize_va_space = 1, 2 or 3).
>
> 3. A virtualized guest attacking the host
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> @@ -499,8 +499,8 @@ Spectre variant 2
> more overhead and run slower.
>
> User programs should use address space randomization
> - (/proc/sys/kernel/randomize_va_space = 1 or 2) to make attacks more
> - difficult.
> + (/proc/sys/kernel/randomize_va_space = 1, 2 or 3) to make attacks
> + more difficult.
>
> 3. VM mitigation
> ^^^^^^^^^^^^^^^^
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index d4b32cc32bb7..bc3bb74d544d 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -1060,6 +1060,21 @@ that support this feature.
> Systems with ancient and/or broken binaries should be configured
> with ``CONFIG_COMPAT_BRK`` enabled, which excludes the heap from process
> address space randomization.
> +
> +3 Additionally enable full randomization of memory mappings created
> + with mmap(NULL, ...). With 2, the base of the VMA used for such
> + mappings is random, but the mappings are created in predictable
> + places within the VMA and in sequential order. With 3, new VMAs
> + are created to fully randomize the mappings. Also mremap(...,
> + MREMAP_MAYMOVE) will move the mappings even if not necessary.
> +
> + On 32 bit systems this may cause problems due to increased VM
> + fragmentation if the address space gets crowded.
> +
> + On all systems, it will reduce performance and increase memory
> + usage due to less efficient use of page tables and inability to
> + merge adjacent VMAs with compatible attributes.
> +
> == ===========================================================================
>
>
> diff --git a/init/Kconfig b/init/Kconfig
> index c9446911cf41..6146e2cd3b77 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1863,7 +1863,7 @@ config COMPAT_BRK
> also breaks ancient binaries (including anything libc5 based).
> This option changes the bootup default to heap randomization
> disabled, and can be overridden at runtime by setting
> - /proc/sys/kernel/randomize_va_space to 2.
> + /proc/sys/kernel/randomize_va_space to 2 or 3.
>
> On non-ancient distros (post-2000 ones) N is usually a safe choice.
>
> diff --git a/mm/internal.h b/mm/internal.h
> index c43ccdddb0f6..b964c8dbb242 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -618,4 +618,12 @@ struct migration_target_control {
> gfp_t gfp_mask;
> };
>
> +#ifndef arch_get_mmap_end
> +#define arch_get_mmap_end(addr) (TASK_SIZE)
> +#endif
> +
> +#ifndef arch_get_mmap_base
> +#define arch_get_mmap_base(addr, base) (base)
> +#endif
> +
> #endif /* __MM_INTERNAL_H */
> diff --git a/mm/mmap.c b/mm/mmap.c
> index d91ecb00d38c..3677491e999b 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -47,6 +47,7 @@
> #include <linux/pkeys.h>
> #include <linux/oom.h>
> #include <linux/sched/mm.h>
> +#include <linux/elf-randomize.h>
>
> #include <linux/uaccess.h>
> #include <asm/cacheflush.h>
> @@ -73,6 +74,8 @@ const int mmap_rnd_compat_bits_max = CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX;
> int mmap_rnd_compat_bits __read_mostly = CONFIG_ARCH_MMAP_RND_COMPAT_BITS;
> #endif
>
> +#define MAX_RANDOM_MMAP_RETRIES 5
> +
> static bool ignore_rlimit_data;
> core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644);
>
> @@ -206,7 +209,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
> #ifdef CONFIG_COMPAT_BRK
> /*
> * CONFIG_COMPAT_BRK can still be overridden by setting
> - * randomize_va_space to 2, which will still cause mm->start_brk
> + * randomize_va_space to >= 2, which will still cause mm->start_brk
> * to be arbitrarily shifted
> */
> if (current->brk_randomized)
> @@ -1445,6 +1448,23 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> if (mm->map_count > sysctl_max_map_count)
> return -ENOMEM;
>
> + /* Pick a random address even outside current VMAs? */
> + if (!addr && randomize_va_space == 3) {
> + int i = MAX_RANDOM_MMAP_RETRIES;
> + unsigned long max_addr = arch_get_mmap_base(addr, mm->mmap_base);
> +
> + do {
> + /* Try a few times to find a free area */
> + addr = arch_mmap_rnd();
> + if (addr >= max_addr)
> + continue;
> + addr = get_unmapped_area(file, addr, len, pgoff, flags);
> + } while (--i >= 0 && !IS_ERR_VALUE(addr));
> +
> + if (IS_ERR_VALUE(addr))
> + addr = 0;
> + }
> +
> /* Obtain the address to map to. we verify (or select) it and ensure
> * that it represents a valid section of the address space.
> */
> @@ -2142,14 +2162,6 @@ unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info)
> return addr;
> }
>
> -#ifndef arch_get_mmap_end
> -#define arch_get_mmap_end(addr) (TASK_SIZE)
> -#endif
> -
> -#ifndef arch_get_mmap_base
> -#define arch_get_mmap_base(addr, base) (base)
> -#endif
> -
> /* Get an address range which is currently unmapped.
> * For shmat() with addr=0.
> *
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 138abbae4f75..c5b2ed2bfd2d 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -24,12 +24,15 @@
> #include <linux/uaccess.h>
> #include <linux/mm-arch-hooks.h>
> #include <linux/userfaultfd_k.h>
> +#include <linux/elf-randomize.h>
>
> #include <asm/cacheflush.h>
> #include <asm/tlbflush.h>
>
> #include "internal.h"
>
> +#define MAX_RANDOM_MREMAP_RETRIES 5
> +
> static pmd_t *get_old_pmd(struct mm_struct *mm, unsigned long addr)
> {
> pgd_t *pgd;
> @@ -720,6 +723,30 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
> goto out;
> }
>
> + if ((flags & MREMAP_MAYMOVE) && randomize_va_space == 3) {
> + /*
> + * Caller is happy with a different address, so let's
> + * move even if not necessary!
> + */
> + int i = MAX_RANDOM_MREMAP_RETRIES;
> + unsigned long max_addr = arch_get_mmap_base(addr, mm->mmap_base);
> +
> + do {
> + /* Try a few times to find a free area */
> + new_addr = arch_mmap_rnd();
> + if (new_addr >= max_addr)
> + continue;
> + ret = mremap_to(addr, old_len, new_addr, new_len,
> + &locked, flags, &uf, &uf_unmap_early,
> + &uf_unmap);
> + if (!IS_ERR_VALUE(ret))
> + goto out;
> + } while (--i >= 0);
> +
> + /* Give up and try the old address */
> + new_addr = addr;
> + }
> +
> /*
> * Always allow a shrinking remap: that just unmaps
> * the unnecessary pages..
>
> base-commit: 3650b228f83adda7e5ee532e2b90429c03f7b9ec
>
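
For readers following the description above ("retried a few times before
falling back to old method"), here is a minimal user-space sketch of that
retry-then-fall-back idea. It is not the patch's code. struct range,
candidate_fn, candidate_list, next_from_list() and pick_random_addr() are
hypothetical names made up for illustration, and the candidate source is a
deterministic stand-in for arch_mmap_rnd(). The sketch keeps drawing only
while a candidate is rejected, and returns 0 to mean "use the normal search".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RETRIES 5 /* same count as MAX_RANDOM_MMAP_RETRIES in the patch */

/* Hypothetical model of one occupied address range, [start, end). */
struct range {
    uintptr_t start;
    uintptr_t end;
};

/* Hypothetical candidate source, standing in for arch_mmap_rnd(). */
typedef uintptr_t (*candidate_fn)(void *ctx);

/* Deterministic candidate source for demonstration: replays a fixed list. */
struct candidate_list {
    const uintptr_t *vals;
    size_t n;
    size_t pos;
};

static uintptr_t next_from_list(void *ctx)
{
    struct candidate_list *cl = ctx;

    /* Once the list is exhausted, return 0, which the caller rejects. */
    return cl->pos < cl->n ? cl->vals[cl->pos++] : 0;
}

/* Return true if [addr, addr + len) intersects any occupied range. */
static bool overlaps(const struct range *used, size_t n,
                     uintptr_t addr, uintptr_t len)
{
    for (size_t i = 0; i < n; i++)
        if (addr < used[i].end && used[i].start < addr + len)
            return true;
    return false;
}

/*
 * Draw up to MAX_RETRIES candidates. Return the first one that fits below
 * limit and does not collide, or 0 to tell the caller to fall back.
 */
static uintptr_t pick_random_addr(const struct range *used, size_t n,
                                  uintptr_t len, uintptr_t limit,
                                  candidate_fn next, void *ctx)
{
    assert(next != NULL);

    if (len == 0 || len > limit)
        return 0;

    for (int i = 0; i < MAX_RETRIES; i++) {
        uintptr_t addr = next(ctx);

        if (addr == 0 || addr > limit - len)
            continue;   /* out of bounds: draw again */
        if (overlaps(used, n, addr, len))
            continue;   /* collides with an existing range: draw again */
        return addr;    /* first acceptable candidate ends the loop */
    }
    return 0;           /* retries exhausted: caller falls back */
}
```

A caller would pass its list of occupied ranges, the requested length and an
upper limit, and treat a return value of 0 the same way as an mmap() hint of
NULL. The bound check is written as addr > limit - len so that addr + len
cannot overflow.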