Message-ID: <CALCETrVg=XQh+9VczkoC-0oLnBHGD=5hswTmyWQUR8_TTpnDsQ@mail.gmail.com>
Date: Thu, 4 Jan 2018 08:17:06 -0800
From: Andy Lutomirski <luto@...nel.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Andy Lutomirski <luto@...nel.org>,
Benjamin Gilbert <benjamin.gilbert@...eos.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
X86 ML <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
linux-mm@...ck.org, stable <stable@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Garnier <thgarnie@...gle.com>,
Alexander Kuleshov <kuleshovmail@...il.com>
Subject: Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
On Thu, Jan 4, 2018 at 4:28 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
> On Wed, 3 Jan 2018, Andy Lutomirski wrote:
>> On Wed, Jan 3, 2018 at 8:35 PM, Benjamin Gilbert
>> <benjamin.gilbert@...eos.com> wrote:
>> > On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
>> >> Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
>> >> to do nothing, and then read /sys/kernel/debug/page_tables/current (or
>> >> current_kernel, or whatever it's called). The problem may be obvious.
>> >
>> > current_kernel attached. I have not seen any crashes with
>> > free_ldt_pgtables() stubbed out.
>>
>> I haven't reproduced it, but I think I see what's wrong. KASLR sets
>> vaddr_end to a totally bogus value. It should be no larger than
>> LDT_BASE_ADDR. I suspect that your vmemmap is getting randomized into
>> the LDT range. If it weren't for that, it could just as easily land
>> in the cpu_entry_area range. This will need fixing in all versions
>> that aren't still called KAISER.
>>
>> Our memory map code is utter shite. This kind of bug should not be
>> possible without a giant warning at boot that something is screwed up.
>
> You're right, it's utter shite, and the KASLR folks who added this insanity
> of making vaddr_end depend on a gazillion config options and not
> documenting it in mm.txt or elsewhere where it's obvious to find should
> really sit back and think hard about their half-baked 'security' features.
>
> Just look at the insanity of the comment above the vaddr_end ifdef maze.
>
> Benjamin, can you test the patch below please?
>
> Thanks,
>
> tglx
>
> 8<--------------
> --- a/Documentation/x86/x86_64/mm.txt
> +++ b/Documentation/x86/x86_64/mm.txt
> @@ -12,8 +12,9 @@ ffffea0000000000 - ffffeaffffffffff (=40
> ... unused hole ...
> ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
> ... unused hole ...
> -fffffe0000000000 - fffffe7fffffffff (=39 bits) LDT remap for PTI
> -fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
> + vaddr_end for KASLR
> +fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
> +fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI
> ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
> ... unused hole ...
> ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
> @@ -37,7 +38,9 @@ ffd4000000000000 - ffd5ffffffffffff (=49
> ... unused hole ...
> ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
> ... unused hole ...
> -fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
> + vaddr_end for KASLR
> +fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
> +... unused hole ...
> ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
> ... unused hole ...
> ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
> --- a/arch/x86/include/asm/pgtable_64_types.h
> +++ b/arch/x86/include/asm/pgtable_64_types.h
> @@ -88,7 +88,7 @@ typedef struct { pteval_t pte; } pte_t;
> # define VMALLOC_SIZE_TB _AC(32, UL)
> # define __VMALLOC_BASE _AC(0xffffc90000000000, UL)
> # define __VMEMMAP_BASE _AC(0xffffea0000000000, UL)
> -# define LDT_PGD_ENTRY _AC(-4, UL)
> +# define LDT_PGD_ENTRY _AC(-3, UL)
> # define LDT_BASE_ADDR (LDT_PGD_ENTRY << PGDIR_SHIFT)
> #endif
If you actually change the memory map order, you need to change the
shadow copy in mm/dump_pagetables.c, too. I have a draft patch to
just sort the damn list, but that's not ready yet.