Message-ID: <CALCETrXcai-gyfnS7uFSRwdR2roBRY78JZ5_1sa0Ux77xzNhvA@mail.gmail.com>
Date: Mon, 30 Nov 2020 09:57:00 -0800
From: Andy Lutomirski <luto@...nel.org>
To: Topi Miettinen <toiwoton@...il.com>
Cc: linux-hardening@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Linux-MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
Jann Horn <jannh@...gle.com>,
Kees Cook <keescook@...omium.org>,
Matthew Wilcox <willy@...radead.org>,
Mike Rapoport <rppt@...nel.org>,
Linux API <linux-api@...r.kernel.org>
Subject: Re: [PATCH v5] mm: Optional full ASLR for mmap(), mremap(), vdso and stack

On Sun, Nov 29, 2020 at 1:20 PM Topi Miettinen <toiwoton@...il.com> wrote:
>
> Writing a new value of 3 to /proc/sys/kernel/randomize_va_space
> enables full randomization of memory mappings created with mmap(NULL,
> ...). With 2, the base of the VMA used for such mappings is random,
> but the mappings are created in predictable places within the VMA and
> in sequential order. With 3, new VMAs are created to fully randomize
> the mappings.
>
> Also mremap(..., MREMAP_MAYMOVE) will move the mappings even if not
> necessary and the location of stack and vdso are also randomized.
>
> The method is to randomize the new address without considering
> VMAs. If the address fails checks because of overlap with the stack
> area (or in case of mremap(), overlap with the old mapping), the
> operation is retried a few times before falling back to old method.
>
> On 32 bit systems this may cause problems due to increased VM
> fragmentation if the address space gets crowded.
>
> On all systems, it will reduce performance and increase memory usage
> due to less efficient use of page tables and inability to merge
> adjacent VMAs with compatible attributes. In the worst case,
> additional page table entries of up to 4 pages are created for each
> mapping, so with small mappings there's considerable penalty.
>
> In this example with sysctl.kernel.randomize_va_space = 2, dynamic
> loader, libc, anonymous memory reserved with mmap() and locale-archive
> are located close to each other:
>
> $ cat /proc/self/maps (only first line for each object shown for brevity)
> 5acea452d000-5acea452f000 r--p 00000000 fe:0c 1868624 /usr/bin/cat
> 74f438f90000-74f4394f2000 r--p 00000000 fe:0c 2473999 /usr/lib/locale/locale-archive
> 74f4394f2000-74f4395f2000 rw-p 00000000 00:00 0
> 74f4395f2000-74f439617000 r--p 00000000 fe:0c 2402332 /usr/lib/x86_64-linux-gnu/libc-2.31.so
> 74f4397b3000-74f4397b9000 rw-p 00000000 00:00 0
> 74f4397e5000-74f4397e6000 r--p 00000000 fe:0c 2400754 /usr/lib/x86_64-linux-gnu/ld-2.31.so
> 74f439811000-74f439812000 rw-p 00000000 00:00 0
> 7fffdca0d000-7fffdca2e000 rw-p 00000000 00:00 0 [stack]
> 7fffdcb49000-7fffdcb4d000 r--p 00000000 00:00 0 [vvar]
> 7fffdcb4d000-7fffdcb4f000 r-xp 00000000 00:00 0 [vdso]
>
> With sysctl.kernel.randomize_va_space = 3, they are located at
> unrelated addresses and the order is random:
>
> $ echo 3 > /proc/sys/kernel/randomize_va_space
> $ cat /proc/self/maps (only first line for each object shown for brevity)
> 3850520000-3850620000 rw-p 00000000 00:00 0
> 28cfb4c8000-28cfb4cc000 r--p 00000000 00:00 0 [vvar]
> 28cfb4cc000-28cfb4ce000 r-xp 00000000 00:00 0 [vdso]
> 9e74c385000-9e74c387000 rw-p 00000000 00:00 0
> a42e0233000-a42e0234000 r--p 00000000 fe:0c 2400754 /usr/lib/x86_64-linux-gnu/ld-2.31.so
> a42e025f000-a42e0260000 rw-p 00000000 00:00 0
> bea40427000-bea4044c000 r--p 00000000 fe:0c 2402332 /usr/lib/x86_64-linux-gnu/libc-2.31.so
> bea405e8000-bea405ec000 rw-p 00000000 00:00 0
> f6d446fa000-f6d44c5c000 r--p 00000000 fe:0c 2473999 /usr/lib/locale/locale-archive
> fcfbf684000-fcfbf6a5000 rw-p 00000000 00:00 0 [stack]
> 619aba62d000-619aba62f000 r--p 00000000 fe:0c 1868624 /usr/bin/cat
>
> CC: Andrew Morton <akpm@...ux-foundation.org>
> CC: Jann Horn <jannh@...gle.com>
> CC: Kees Cook <keescook@...omium.org>
> CC: Matthew Wilcox <willy@...radead.org>
> CC: Mike Rapoport <rppt@...nel.org>
> CC: Linux API <linux-api@...r.kernel.org>
> Signed-off-by: Topi Miettinen <toiwoton@...il.com>
> ---
> v2: also randomize mremap(..., MREMAP_MAYMOVE)
> v3: avoid stack area and retry in case of bad random address (Jann
> Horn), improve description in kernel.rst (Matthew Wilcox)
> v4:
> - use /proc/$pid/maps in the example (Mike Rapoport)
> - CCs (Andrew Morton)
> - only check randomize_va_space == 3
> v5: randomize also vdso and stack
> ---
> Documentation/admin-guide/hw-vuln/spectre.rst | 6 ++--
> Documentation/admin-guide/sysctl/kernel.rst | 20 +++++++++++++
> arch/x86/entry/vdso/vma.c | 26 +++++++++++++++-
> include/linux/mm.h | 8 +++++
> init/Kconfig | 2 +-
> mm/mmap.c | 30 +++++++++++++------
> mm/mremap.c | 27 +++++++++++++++++
> mm/util.c | 6 ++++
> 8 files changed, 111 insertions(+), 14 deletions(-)
>
> diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
> index e05e581af5cf..9ea250522077 100644
> --- a/Documentation/admin-guide/hw-vuln/spectre.rst
> +++ b/Documentation/admin-guide/hw-vuln/spectre.rst
> @@ -254,7 +254,7 @@ Spectre variant 2
> left by the previous process will also be cleared.
>
> User programs should use address space randomization to make attacks
> - more difficult (Set /proc/sys/kernel/randomize_va_space = 1 or 2).
> + more difficult (Set /proc/sys/kernel/randomize_va_space = 1, 2 or 3).
>
> 3. A virtualized guest attacking the host
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> @@ -499,8 +499,8 @@ Spectre variant 2
> more overhead and run slower.
>
> User programs should use address space randomization
> - (/proc/sys/kernel/randomize_va_space = 1 or 2) to make attacks more
> - difficult.
> + (/proc/sys/kernel/randomize_va_space = 1, 2 or 3) to make attacks
> + more difficult.
>
> 3. VM mitigation
> ^^^^^^^^^^^^^^^^
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index d4b32cc32bb7..806e3b29d2b5 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -1060,6 +1060,26 @@ that support this feature.
> Systems with ancient and/or broken binaries should be configured
> with ``CONFIG_COMPAT_BRK`` enabled, which excludes the heap from process
> address space randomization.
> +
> +3 Additionally enable full randomization of memory mappings created
> + with mmap(NULL, ...). With 2, the base of the VMA used for such
> + mappings is random, but the mappings are created in predictable
> + places within the VMA and in sequential order. With 3, new VMAs
> + are created to fully randomize the mappings.
> +
> + Also mremap(..., MREMAP_MAYMOVE) will move the mappings even if
> + not necessary and the location of stack and vdso are also
> + randomized.
> +
> + On 32 bit systems this may cause problems due to increased VM
> + fragmentation if the address space gets crowded.
> +
> + On all systems, it will reduce performance and increase memory
> + usage due to less efficient use of page tables and inability to
> + merge adjacent VMAs with compatible attributes. In the worst case,
> + additional page table entries of up to 4 pages are created for
> + each mapping, so with small mappings there's considerable penalty.
> +
> == ===========================================================================
>
>
> diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
> index 9185cb1d13b9..03ea884822e3 100644
> --- a/arch/x86/entry/vdso/vma.c
> +++ b/arch/x86/entry/vdso/vma.c
> @@ -12,6 +12,7 @@
> #include <linux/init.h>
> #include <linux/random.h>
> #include <linux/elf.h>
> +#include <linux/elf-randomize.h>
> #include <linux/cpu.h>
> #include <linux/ptrace.h>
> #include <linux/time_namespace.h>
> @@ -32,6 +33,8 @@
> const size_t name ## _offset = offset;
> #include <asm/vvar.h>
>
> +#define MAX_RANDOM_VDSO_RETRIES 5
> +
> struct vdso_data *arch_get_vdso_data(void *vvar_page)
> {
> return (struct vdso_data *)(vvar_page + _vdso_data_offset);
> @@ -361,7 +364,28 @@ static unsigned long vdso_addr(unsigned long start, unsigned len)
>
> static int map_vdso_randomized(const struct vdso_image *image)
> {
> - unsigned long addr = vdso_addr(current->mm->start_stack, image->size-image->sym_vvar_start);
> + unsigned long addr;
> +
> + if (randomize_va_space == 3) {
> + /*
> + * Randomize vdso address.
> + */
> + int i = MAX_RANDOM_VDSO_RETRIES;
> +
> + do {
> + int ret;
> +
> + /* Try a few times to find a free area */
> + addr = arch_mmap_rnd();
> +
> + ret = map_vdso(image, addr);
> + if (!IS_ERR_VALUE(ret))
> + return ret;
> + } while (--i >= 0);
> +
> + /* Give up and try the less random way */
> + }
> + addr = vdso_addr(current->mm->start_stack, image->size-image->sym_vvar_start);

This is IMO rather ugly. You're picking random numbers and throwing
them at map_vdso(), which throws them at get_unmapped_area(), which
will validate them. And you duplicate the same ugly loop later on.

How about instead pushing this logic into get_unmapped_area()?

--Andy
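
For illustration only, here is a minimal user-space sketch of the shape that suggestion could take: the bounded random-retry loop and the deterministic fallback live in one allocator-level function, so callers such as the vdso, mmap() and mremap() paths would not each carry their own copy of the loop. Everything in it (struct toy_mm, toy_get_unmapped_area(), ADDR_LIMIT, the first-fit fallback, the use of rand()) is a hypothetical stand-in and not kernel code; the real get_unmapped_area() has a different signature and different search logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE   4096UL
#define ADDR_LIMIT  (1UL << 30)  /* toy address space size (assumption) */
#define MAX_RETRIES 5            /* same bound the patch uses for the vdso */
#define MAX_REGIONS 16

struct region { unsigned long start, end; };  /* half-open: [start, end) */

/* Hypothetical stand-in for the set of existing mappings. */
struct toy_mm {
	struct region used[MAX_REGIONS];
	size_t nr_used;
};

/* True if [addr, addr + len) is inside the toy space and overlaps nothing. */
static bool range_is_free(const struct toy_mm *mm, unsigned long addr,
			  unsigned long len)
{
	assert(mm->nr_used <= MAX_REGIONS);
	if (len == 0 || addr > ADDR_LIMIT || len > ADDR_LIMIT - addr)
		return false;
	for (size_t i = 0; i < mm->nr_used; i++)
		if (addr < mm->used[i].end && mm->used[i].start < addr + len)
			return false;
	return true;
}

/* Deterministic fallback: lowest free page-aligned address, 0 if none. */
static unsigned long first_fit(const struct toy_mm *mm, unsigned long len)
{
	if (len == 0 || len > ADDR_LIMIT)
		return 0;
	for (unsigned long addr = PAGE_SIZE; addr <= ADDR_LIMIT - len;
	     addr += PAGE_SIZE)
		if (range_is_free(mm, addr, len))
			return addr;
	return 0;
}

/* rand() is only a placeholder here; it is not a suitable entropy source. */
static unsigned long random_page(void)
{
	return ((unsigned long)rand() % (ADDR_LIMIT / PAGE_SIZE)) * PAGE_SIZE;
}

/*
 * Single entry point: try a few random candidates when full randomization
 * is requested, then fall back to the deterministic search. Callers never
 * see the retry loop.
 */
unsigned long toy_get_unmapped_area(const struct toy_mm *mm,
				    unsigned long len, bool full_random)
{
	if (full_random) {
		for (int i = 0; i < MAX_RETRIES; i++) {
			unsigned long addr = random_page();

			if (addr != 0 && range_is_free(mm, addr, len))
				return addr;
		}
	}
	return first_fit(mm, len);
}
```

With this layout a caller would simply pass the "full randomization" decision down (here the full_random flag, standing in for a randomize_va_space == 3 check) and use whatever address comes back.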