Date:	Thu, 30 Oct 2014 10:27:31 -0700
From:	Kees Cook <keescook@...omium.org>
To:	Junjie Mao <eternal.n08@...il.com>
Cc:	Fengguang Wu <fengguang.wu@...el.com>,
	LKML <linux-kernel@...r.kernel.org>,
	"x86@...nel.org" <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH] x86, kaslr: Prevent .bss from overlaping initrd

[Adding x86 and hpa to CC]

On Thu, Oct 30, 2014 at 10:25 AM, Kees Cook <keescook@...omium.org> wrote:
> Ah! Thank you for tracking this bug down. I had never been able to reproduce it!
>
> On Thu, Oct 30, 2014 at 5:41 AM, Junjie Mao <eternal.n08@...il.com> wrote:
>> When choosing a random address, the current implementation does not take into
>> account the reserved space for .bss and .brk sections. Thus the relocated kernel
>> may overlap other components in memory, e.g. the initrd image:
>>
>> +-------------------+
>> |   decompressed    |
>> |      kernel       |
>> |   (relocated)     |
>
> Strictly speaking, the relocation table follows the decompressed
> kernel, though it is part of the decompressed data. It isn't in use
> once the kernel starts. Does (should) .bss overlap it?
>
>> +-------------------+--
>> |                   |  \
>> +-------------------+   .bss and .brk section
>> |                   |  /
>> |      initrd       |--
>> |                   |
>> +-------------------+
>>
>> Here is an example of the overlap from an x86_64 kernel in qemu (the ranges of
>> physical addresses are presented):
>>
>> compressed kernel:      [0x0449626e, 0x04e30aa3]
>> initrd:                 [0x13ce6000, 0x13fef373]
>> relocated kernel:       [0x0fe00000, 0x13c1c2bb]
>> .bss and .brk sections: [0x13c1c2bc, 0x148262bb]
>
> What did you use to instrument this? I'd be curious to see how large
> the relocation table is too.
>
>> The initrd image will then be overwritten by the memset during early
>> initialization:
>>
>> [    1.655204] Unpacking initramfs...
>> [    1.662831] Initramfs unpacking failed: junk in compressed archive
>>
>> This patch prevents the above situation by requiring a larger space when looking
>> for a random kernel base, so that existing logic can effectively avoid the
>> overlap.
>
> Yes, thank you again for tracking this down!
>
>>
>> Fixes: 82fa9637a2 ("x86, kaslr: Select random position from e820 maps")
>> Reported-by: Fengguang Wu <fengguang.wu@...el.com>
>> Signed-off-by: Junjie Mao <eternal.n08@...il.com>
>
> This should also go to stable:
>
> Cc: stable@...r.kernel.org
>
>> ---
>>  arch/x86/boot/compressed/Makefile  |  3 ++-
>>  arch/x86/boot/compressed/aslr.c    |  5 +++--
>>  arch/x86/boot/compressed/head_32.S |  3 ++-
>>  arch/x86/boot/compressed/head_64.S |  3 +++
>>  arch/x86/boot/compressed/misc.c    |  6 ++++--
>>  arch/x86/boot/compressed/misc.h    |  6 ++++--
>>  arch/x86/boot/compressed/mkpiggy.c |  8 ++++++--
>>  arch/x86/tools/calc_reserved.awk   | 21 +++++++++++++++++++++
>>  8 files changed, 45 insertions(+), 10 deletions(-)
>>  create mode 100644 arch/x86/tools/calc_reserved.awk
>>
>> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
>> index 704f58aa79cd..419e12b203d9 100644
>> --- a/arch/x86/boot/compressed/Makefile
>> +++ b/arch/x86/boot/compressed/Makefile
>> @@ -76,8 +76,9 @@ suffix-$(CONFIG_KERNEL_XZ)    := xz
>>  suffix-$(CONFIG_KERNEL_LZO)    := lzo
>>  suffix-$(CONFIG_KERNEL_LZ4)    := lz4
>>
>> +RESERVED_SIZE = $(shell objdump -h vmlinux | awk -f $(srctree)/arch/x86/tools/calc_reserved.awk)
>>  quiet_cmd_mkpiggy = MKPIGGY $@
>> -      cmd_mkpiggy = $(obj)/mkpiggy $< > $@ || ( rm -f $@ ; false )
>> +      cmd_mkpiggy = $(obj)/mkpiggy $< $(RESERVED_SIZE) > $@ || ( rm -f $@ ; false )
>>
>>  targets += piggy.S
>>  $(obj)/piggy.S: $(obj)/vmlinux.bin.$(suffix-y) $(obj)/mkpiggy FORCE
>> diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
>> index bb1376381985..d4695b022971 100644
>> --- a/arch/x86/boot/compressed/aslr.c
>> +++ b/arch/x86/boot/compressed/aslr.c
>> @@ -298,7 +298,8 @@ static unsigned long find_random_addr(unsigned long minimum,
>>  unsigned char *choose_kernel_location(unsigned char *input,
>>                                       unsigned long input_size,
>>                                       unsigned char *output,
>> -                                     unsigned long output_size)
>> +                                     unsigned long output_size,
>> +                                     unsigned long reserved_size)
>>  {
>>         unsigned long choice = (unsigned long)output;
>>         unsigned long random;
>> @@ -320,7 +321,7 @@ unsigned char *choose_kernel_location(unsigned char *input,
>>                        (unsigned long)output, output_size);
>>
>>         /* Walk e820 and find a random address. */
>> -       random = find_random_addr(choice, output_size);
>> +       random = find_random_addr(choice, output_size + reserved_size);
>>         if (!random) {
>>                 debug_putstr("KASLR could not find suitable E820 region...\n");
>>                 goto out;
>> diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
>> index cbed1407a5cd..06c18f6d1f13 100644
>> --- a/arch/x86/boot/compressed/head_32.S
>> +++ b/arch/x86/boot/compressed/head_32.S
>> @@ -207,6 +207,7 @@ relocated:
>>   * Do the decompression, and jump to the new kernel..
>>   */
>>                                 /* push arguments for decompress_kernel: */
>> +       pushl   $reserved_size
>>         pushl   $z_output_len   /* decompressed length */
>>         leal    z_extract_offset_negative(%ebx), %ebp
>>         pushl   %ebp            /* output address */
>> @@ -217,7 +218,7 @@ relocated:
>>         pushl   %eax            /* heap area */
>>         pushl   %esi            /* real mode pointer */
>>         call    decompress_kernel /* returns kernel location in %eax */
>> -       addl    $24, %esp
>> +       addl    $28, %esp
>>
>>  /*
>>   * Jump to the decompressed kernel.
>> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
>> index 2884e0c3e8a5..02c518f8aca5 100644
>> --- a/arch/x86/boot/compressed/head_64.S
>> +++ b/arch/x86/boot/compressed/head_64.S
>> @@ -402,6 +402,8 @@ relocated:
>>   * Do the decompression, and jump to the new kernel..
>>   */
>>         pushq   %rsi                    /* Save the real mode argument */
>> +       movq    $reserved_size, %r9
>> +       pushq   %r9
>>         movq    %rsi, %rdi              /* real mode address */
>>         leaq    boot_heap(%rip), %rsi   /* malloc area for uncompression */
>>         leaq    input_data(%rip), %rdx  /* input_data */
>> @@ -409,6 +411,7 @@ relocated:
>>         movq    %rbp, %r8               /* output target address */
>>         movq    $z_output_len, %r9      /* decompressed length */
>>         call    decompress_kernel       /* returns kernel location in %rax */
>> +       popq    %r9
>>         popq    %rsi
>>
>>  /*
>> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
>> index 57ab74df7eea..173062d32898 100644
>> --- a/arch/x86/boot/compressed/misc.c
>> +++ b/arch/x86/boot/compressed/misc.c
>> @@ -358,7 +358,8 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
>>                                   unsigned char *input_data,
>>                                   unsigned long input_len,
>>                                   unsigned char *output,
>> -                                 unsigned long output_len)
>> +                                 unsigned long output_len,
>> +                                 unsigned long reserved_size)
>>  {
>>         real_mode = rmode;
>>
>> @@ -382,7 +383,8 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
>>         free_mem_end_ptr = heap + BOOT_HEAP_SIZE;
>>
>>         output = choose_kernel_location(input_data, input_len,
>> -                                       output, output_len);
>> +                                       output, output_len,
>> +                                       reserved_size);
>>
>>         /* Validate memory location choices. */
>>         if ((unsigned long)output & (MIN_KERNEL_ALIGN - 1))
>> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
>> index 24e3e569a13c..fae4cef40f1f 100644
>> --- a/arch/x86/boot/compressed/misc.h
>> +++ b/arch/x86/boot/compressed/misc.h
>> @@ -59,7 +59,8 @@ int cmdline_find_option_bool(const char *option);
>>  unsigned char *choose_kernel_location(unsigned char *input,
>>                                       unsigned long input_size,
>>                                       unsigned char *output,
>> -                                     unsigned long output_size);
>> +                                     unsigned long output_size,
>> +                                     unsigned long reserved_size);
>>  /* cpuflags.c */
>>  bool has_cpuflag(int flag);
>>  #else
>> @@ -67,7 +68,8 @@ static inline
>>  unsigned char *choose_kernel_location(unsigned char *input,
>>                                       unsigned long input_size,
>>                                       unsigned char *output,
>> -                                     unsigned long output_size)
>> +                                     unsigned long output_size,
>> +                                     unsigned long reserved_size)
>>  {
>>         return output;
>>  }
>> diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
>> index b669ab65bf6c..a31983ced81b 100644
>> --- a/arch/x86/boot/compressed/mkpiggy.c
>> +++ b/arch/x86/boot/compressed/mkpiggy.c
>> @@ -36,11 +36,12 @@ int main(int argc, char *argv[])
>>         uint32_t olen;
>>         long ilen;
>>         unsigned long offs;
>> +       unsigned long reserved_size;
>>         FILE *f = NULL;
>>         int retval = 1;
>>
>> -       if (argc < 2) {
>> -               fprintf(stderr, "Usage: %s compressed_file\n", argv[0]);
>> +       if (argc < 3) {
>> +               fprintf(stderr, "Usage: %s compressed_file reserved_size\n", argv[0]);
>>                 goto bail;
>>         }
>>
>> @@ -74,6 +75,7 @@ int main(int argc, char *argv[])
>>         offs += olen >> 12;     /* Add 8 bytes for each 32K block */
>>         offs += 64*1024 + 128;  /* Add 64K + 128 bytes slack */
>>         offs = (offs+4095) & ~4095; /* Round to a 4K boundary */
>> +       reserved_size = atoi(argv[2]);
>>
>>         printf(".section \".rodata..compressed\",\"a\",@progbits\n");
>>         printf(".globl z_input_len\n");
>> @@ -85,6 +87,8 @@ int main(int argc, char *argv[])
>>         /* z_extract_offset_negative allows simplification of head_32.S */
>>         printf(".globl z_extract_offset_negative\n");
>>         printf("z_extract_offset_negative = -0x%lx\n", offs);
>> +       printf(".globl reserved_size\n");
>> +       printf("reserved_size = 0x%lx\n", reserved_size);
>>
>>         printf(".globl input_data, input_data_end\n");
>>         printf("input_data:\n");
>> diff --git a/arch/x86/tools/calc_reserved.awk b/arch/x86/tools/calc_reserved.awk
>> new file mode 100644
>> index 000000000000..2ca69682338a
>> --- /dev/null
>> +++ b/arch/x86/tools/calc_reserved.awk
>> @@ -0,0 +1,21 @@
>> +#!/bin/awk -f
>> +#
>> +# Calculate the amount of space that we have to reserve for .bss and .brk
>> +# sections
>> +#
>> +# Usage:
>> +# objdump -h a.out | awk -f calc_reserved.awk
>> +
>> +BEGIN {
>> +       sections = "^.bss$|^.brk$"
>> +       size = 0;
>> +}
>> +
>> +/^ *[0-9]+ [a-z._]+ *[0-9a-f]+/ {
>> +       if (match($2, sections))
>> +               size += strtonum("0x" $3);
>> +}
>> +
>> +END {
>> +       printf("%d\n", size);
>> +}
>> --
>> 1.9.3
>
> Should mkpiggy.c do the work itself instead of taking an argument?
>
> Regardless, I want to make sure we've got the right values here. On
> one of my builds:
>
> vmlinux.bin, size 21186672, "objdump -h" shows:
>
>  25 .bss          00da2000  ffffffff82034000  0000000002034000  01434000  2**12
>                   ALLOC
>  26 .brk          00026000  ffffffff82dd6000  0000000002dd6000  01434000  2**0
>                   ALLOC
>
> vmlinux.relocs, size 1058872
>
> vmlinux.bin.gz, size 7549715, shows a decompressed size of 22245544:
> $ dd if=vmlinux.bin.gz bs=1 skip=7549711 2>/dev/null | hexdump -C
> 00000000  a8 70 53 01                                       |.pS.|
> 00000004
>
> 0x015370a8 == 22245544 == 21186672 + 1058872
>
> This means relocs are overlapping .bss, since .bss starts at
> 0x01434000 (21184512). The reserved space actually needed is
> 0x01434000 + 0x00da2000 + 0x00026000 (35635200), rather than 22245544
> + 0x00da2000 + 0x00026000 (36696232).
>
> So, since z_output_len is really "end of relocs table" and not "end of
> kernel", we actually need two offsets:
>
> - relocs table end (exists already via z_output_len)
> - end of actual kernel plus .bss and .brk, which is the sum of:
>   - actual size of decompressed kernel image (vmlinux.bin: 21186672)
>   - size of .bss + .brk (0x00da2000 + 0x00026000)
>
> And kASLR needs to use the larger of these two values.
>
> -Kees
>
> --
> Kees Cook
> Chrome OS Security
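
For illustration, the sizing rule described in the quoted message can be
sketched in C as below. This is only a minimal sketch, not the eventual
patch: struct image_layout, run_size() and required_size() are hypothetical
names that do not exist in the kernel, and the sketch assumes that .brk
immediately follows .bss, as it does in the objdump output quoted above.

```c
#include <stdint.h>

/*
 * Hypothetical description of a decompressed kernel image.
 * None of these names are kernel symbols; they exist only for this sketch.
 */
struct image_layout {
	uint64_t output_len;	/* decompressed payload incl. relocs (z_output_len) */
	uint64_t bss_offset;	/* offset of .bss from the start of the image */
	uint64_t bss_size;	/* size of .bss */
	uint64_t brk_size;	/* size of .brk, assumed to follow .bss directly */
};

/* Extent of the running kernel: everything up to .bss, plus .bss and .brk. */
static uint64_t run_size(const struct image_layout *l)
{
	return l->bss_offset + l->bss_size + l->brk_size;
}

/*
 * Space a randomly chosen slot must cover: the larger of the decompressed
 * payload (which ends at the relocs table) and the running kernel extent.
 */
static uint64_t required_size(const struct image_layout *l)
{
	uint64_t run = run_size(l);

	return run > l->output_len ? run : l->output_len;
}
```

With the numbers from the build above (output_len = 22245544,
bss_offset = 0x01434000, bss_size = 0x00da2000, brk_size = 0x00026000),
run_size() gives 35635200, which is larger than z_output_len, so
required_size() also returns 35635200. Adding .bss and .brk on top of
z_output_len, as the patch does, would give 36696232 instead.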



-- 
Kees Cook
Chrome OS Security
