linux-kernel - Re: [RFC] x86/mm/KASLR: Remap GDTs at fixed location


Message-ID: <CALCETrXgabhubUXMiTP5AgSASMkkzG+bYFaSoPt52QBZLm-PVg@mail.gmail.com>
Date:   Fri, 6 Jan 2017 15:39:13 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     Thomas Garnier <thgarnie@...gle.com>
Cc:     Andy Lutomirski <luto@...nel.org>, Ingo Molnar <mingo@...nel.org>,
        Arjan van de Ven <arjan@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Kees Cook <keescook@...omium.org>,
        Borislav Petkov <bp@...en8.de>, Dave Hansen <dave@...1.net>,
        Chen Yucong <slaoub@...il.com>,
        Paul Gortmaker <paul.gortmaker@...driver.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Masahiro Yamada <yamada.masahiro@...ionext.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Anna-Maria Gleixner <anna-maria@...utronix.de>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Michael Ellerman <mpe@...erman.id.au>,
        Juergen Gross <jgross@...e.com>,
        Richard Weinberger <richard@....at>, X86 ML <x86@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "kernel-hardening@...ts.openwall.com" 
        <kernel-hardening@...ts.openwall.com>
Subject: Re: [RFC] x86/mm/KASLR: Remap GDTs at fixed location

On Fri, Jan 6, 2017 at 2:54 PM, Thomas Garnier <thgarnie@...gle.com> wrote:
> On Fri, Jan 6, 2017 at 1:59 PM, Andy Lutomirski <luto@...nel.org> wrote:
>> On Fri, Jan 6, 2017 at 10:03 AM, Thomas Garnier <thgarnie@...gle.com> wrote:
>>> On Thu, Jan 5, 2017 at 10:49 PM, Ingo Molnar <mingo@...nel.org> wrote:
>>>>
>>>> * Thomas Garnier <thgarnie@...gle.com> wrote:
>>>>
>>>>> >> Not sure I fully understood and I don't want to miss an important point. Do
>>>>> >> you mean making GDT (remapping and per-cpu) read-only and switch the
>>>>> >> writeable flag only when we write to the per-cpu entry?
>>>>> >
>>>>> > What I mean is: write to the GDT through normal percpu access (or whatever the
>>>>> > normal mapping is) but load a read-only alias into the GDT register.  As long
>>>>> > as nothing ever tries to write through the GDTR alias, no page faults will be
>>>>> > generated.  So we just need to make sure that nothing ever writes to it
>>>>> > through GDTR.  AFAIK the only reason the CPU ever writes to the address in
>>>>> > GDTR is to set an accessed bit.
>>>>>
>>>>> A write is made when we use load_TR_desc (ltr). I didn't see any other yet.
>>>>
>>>> Is this write to the GDT, generated by the LTR instruction, done unconditionally
>>>> by the hardware?
>>>>
>>>
>>> That was my experience. I didn't look into details. Do you think we
>>> could change something so that ltr never writes to the GDT? (just mark
>>> the TSS entry busy).
>>
>> No, and I had the way this worked on 64-bit wrong.  LTR requires an
>> available TSS and changes it to busy.  So here are my thoughts on how
>> this should work:
>>
>> Let's get rid of any connection between this code and KASLR.  Every
>> time KASLR makes something work differently, a kitten turns all
>> Schrödinger on us.  This is moving the GDT to the fixmap, plain and
>> simple.  For now, make it one page per CPU and don't worry about the
>> GDT limit.
>
> I am all for this change but that's more significant.
>
> Ingo: What do you think about that?
>
>>
>> On 32-bit, we're going to have to make the fixmap GDT be read-write
>> because making it read-only will break double-fault handling.
>>
>> On 64-bit, we can use your trick of temporarily mapping the GDT
>> read-write every time we load TR, which should happen very rarely.
>> Alternatively, we can reload the *GDT* every time we reload TR, which
>> should be comparably slow.  This is going to regress performance in
>> the extremely rare case where KVM exits to a process that uses
>> ioperm() (I think), but I doubt anyone cares.  Or maybe we could
>> arrange to never reload TR when GDT points at the fixmap by having KVM
>> set the host GDT to the direct version and letting KVM's code to
>> reload the GDT switch to the fixmap copy.
>>
>> If we need a quirk to keep the fixmap copy read-write, so be it.
>>
>> None of this should depend on KASLR.  IMO it should happen unconditionally.
>>
>
> I looked back at the fixmap, and I can see a way it could be done
> (using NR_CPUS) like the other fixmap ranges. It would limit the
> number of CPUs to 512 (there is 2M of memory left in the fixmap in the
> default configuration). That's if we never add any other fixmap entries
> on x86-64. I don't know if that is an acceptable number or whether the
> fixmap region could be increased. (128 if we do your KVM trick, of course).
>

IIRC we need 4096 CPUs.  But that 2M limit seems eminently fixable.  I
just tried sticking 4096 pages of nothing right near the top of the
fixmap and the only problem I saw was that I had to move MODULES_END
down a little bit.
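
As a rough check of the numbers above, here is a minimal sketch of the
arithmetic, assuming 4 KiB pages and one fixmap page per CPU.  The helper
names and SKETCH_PAGE_SIZE are made up for illustration and are not kernel
symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on default x86 configurations. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical helper: how many CPUs fit in fixmap_bytes of fixmap space
 * when every CPU gets pages_per_cpu pages. */
static unsigned long cpus_that_fit(unsigned long fixmap_bytes,
                                   unsigned long pages_per_cpu)
{
    assert(pages_per_cpu > 0);
    return fixmap_bytes / (SKETCH_PAGE_SIZE * pages_per_cpu);
}

/* Hypothetical helper: fixmap space needed for nr_cpus CPUs. */
static unsigned long fixmap_bytes_needed(unsigned long nr_cpus,
                                         unsigned long pages_per_cpu)
{
    return nr_cpus * pages_per_cpu * SKETCH_PAGE_SIZE;
}
```

With 2M left and one page per CPU, cpus_that_fit(2UL << 20, 1) gives 512,
which matches the limit quoted above.  Covering 4096 CPUs at one page each
needs fixmap_bytes_needed(4096, 1), i.e. 16M, which is the "4096 pages of
nothing" experiment.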

--Andy

P.S. Let's do the move to the fixmap, read/write, as a separate patch.
That will make bisecting much easier.
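
For reference, the LTR behaviour discussed earlier in the thread (LTR
requires an available TSS and changes it to busy) can be modelled in plain
C.  This is only a sketch: model_ltr() and the other helper names are
hypothetical, and it models just the 4-bit type field (bits 40..43 of the
descriptor's low quadword), where 0x9 is an available 64-bit TSS and 0xB is
a busy one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TSS_TYPE_AVAILABLE 0x9u  /* 64-bit TSS, available */
#define TSS_TYPE_BUSY      0xBu  /* 64-bit TSS, busy */

/* Extract the type field from the low 8 bytes of a system descriptor. */
static unsigned tss_desc_type(uint64_t desc_low)
{
    return (unsigned)((desc_low >> 40) & 0xF);
}

/* Return a copy of the descriptor with its type field replaced. */
static uint64_t tss_desc_set_type(uint64_t desc_low, unsigned type)
{
    return (desc_low & ~(0xFULL << 40)) | ((uint64_t)(type & 0xF) << 40);
}

/* Model of LTR: fails (hardware would raise #GP) unless the TSS is
 * available; on success it writes the busy type back into the descriptor.
 * That write-back is the store that a read-only GDT alias would fault on. */
static int model_ltr(uint64_t *desc_low)
{
    assert(desc_low != NULL);
    if (tss_desc_type(*desc_low) != TSS_TYPE_AVAILABLE)
        return -1;
    *desc_low = tss_desc_set_type(*desc_low, TSS_TYPE_BUSY);
    return 0;
}
```

In this model a second model_ltr() on the same descriptor fails, which is
why simply pre-marking the TSS entry busy does not avoid the write: the
entry has to be made available again before TR can be reloaded.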