linux-kernel - Re: KAISER memory layout (Re: [PATCH 06/23] x86, kaiser: introduce user-mapped percpu areas)


Message-ID: <alpine.DEB.2.20.1711021653240.2090@nanos>
Date:   Thu, 2 Nov 2017 17:03:38 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Andy Lutomirski <luto@...capital.net>
cc:     Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        moritz.lipp@...k.tugraz.at,
        Daniel Gruss <daniel.gruss@...k.tugraz.at>,
        michael.schwarz@...k.tugraz.at,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Kees Cook <keescook@...gle.com>,
        Hugh Dickins <hughd@...gle.com>, X86 ML <x86@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: KAISER memory layout (Re: [PATCH 06/23] x86, kaiser: introduce
 user-mapped percpu areas)

On Thu, 2 Nov 2017, Andy Lutomirski wrote:
> > On Nov 2, 2017, at 1:45 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
> > Simpler is not the question. I want to avoid mapping the whole IST stacks.
> > 
> 
> OK, let's see.  We can have the IDT be different in the user tables and
> the kernel tables.  The user IDT could have IST-less entry stubs that do
> their own CR3 switch and then bounce to the IST stack.  I don't see why
> this wouldn't work aside from requiring a substantially larger entry
> stack, but I'm also not convinced it's worth the added complexity.  The
> NMI code would certainly need some careful thought to convince ourselves
> that it would still be correct.  #DF would be, um, interesting because of
> the silly ESPFIX64 thing.
>
> My inclination would be to deal with this later.  For the first upstream
> version, we map the IST stacks.  Later on, we have a separate user IDT
> that does whatever it needs to do.
>
> The argument to the contrary would be that Dave's CR3 code *and* my entry
> stack crap gets simpler if all the CR3 switches happen in special stubs.
>
> The argument against *that* is that this erase_kstack crap might also
> benefit from the magic stack switch.  OTOH that's the *exit* stack, which
> is totally independent.

My initial thought was: Always use IST stub stacks for entry and exit.

So the entry/exit stubs deal with the CR3 stuff and also with the extra
magic for espfix and nested NMIs, etc. Once that is done, you just flip
over to the relevant kernel internal stack and switch back to the user
visible one on return. Haven't thought that through completely, but in my
naive view it made stuff simpler.
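
A toy user-space model of that flow, purely as a sketch and not kernel
code. Every name in it (cpu_model, stub_entry, stub_exit, state_ok,
round_trip) is made up for illustration, and the espfix and nested NMI
handling is only marked by a comment. It just shows the ordering: CR3 is
switched while running on the user-visible stub stack, and the kernel
internal stack is only in use while the kernel page tables are live.

```c
#include <assert.h>

/* Hypothetical toy model: none of these names exist in the kernel. */
enum pgtable { USER_PGD, KERNEL_PGD };
enum stack   { STUB_STACK, KERNEL_STACK };

struct cpu_model {
	enum pgtable cr3;	/* which page tables are live */
	enum stack   sp;	/* which stack we are running on */
};

/* Invariant: never sit on the kernel internal stack with user page tables. */
int state_ok(const struct cpu_model *cpu)
{
	return !(cpu->sp == KERNEL_STACK && cpu->cr3 == USER_PGD);
}

/* Entry stub: starts on the user-visible stub stack. */
void stub_entry(struct cpu_model *cpu)
{
	cpu->sp = STUB_STACK;		/* entry lands on the stub stack */
	assert(state_ok(cpu));
	cpu->cr3 = KERNEL_PGD;		/* CR3 switch; espfix/NMI magic would go here */
	assert(state_ok(cpu));
	cpu->sp = KERNEL_STACK;		/* flip to the kernel internal stack */
	assert(state_ok(cpu));
}

/* Exit stub: the mirror image, back to the user-visible stack first. */
void stub_exit(struct cpu_model *cpu)
{
	cpu->sp = STUB_STACK;		/* leave the kernel internal stack */
	assert(state_ok(cpu));
	cpu->cr3 = USER_PGD;		/* only now switch back to user page tables */
	assert(state_ok(cpu));
}

/* One full entry/exit cycle; returns 1 if we end up back in the user state. */
int round_trip(struct cpu_model *cpu)
{
	stub_entry(cpu);
	/* ... the real handler would run here, on KERNEL_STACK ... */
	stub_exit(cpu);
	return cpu->cr3 == USER_PGD && cpu->sp == STUB_STACK;
}
```

Doing the two steps of stub_exit() in the opposite order would trip the
invariant, which is the whole point of routing entry and exit through
the stubs.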

> FWIW, I want to get rid of the #DB and #BP stacks entirely, but that does
> not deserve to block this series, I think.

Agreed.

Thanks,

	tglx