Message-ID: <20180713105620.z6bjhqzfez2hll6r@8bytes.org>
Date: Fri, 13 Jul 2018 12:56:20 +0200
From: Joerg Roedel <joro@...tes.org>
To: Andy Lutomirski <luto@...capital.net>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>,
"H . Peter Anvin" <hpa@...or.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...el.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Juergen Gross <jgross@...e.com>,
Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>, Jiri Kosina <jkosina@...e.cz>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Brian Gerst <brgerst@...il.com>,
David Laight <David.Laight@...lab.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
Eduardo Valentin <eduval@...zon.com>,
Greg KH <gregkh@...uxfoundation.org>,
Will Deacon <will.deacon@....com>, aliguori@...zon.com,
daniel.gruss@...k.tugraz.at, hughd@...gle.com, keescook@...gle.com,
Andrea Arcangeli <aarcange@...hat.com>,
Waiman Long <llong@...hat.com>, Pavel Machek <pavel@....cz>,
"David H . Gutteridge" <dhgutteridge@...patico.ca>, jroedel@...e.de
Subject: Re: [PATCH 07/39] x86/entry/32: Enter the kernel via trampoline stack
Hi Andy,
thanks for your valuable feedback.
On Thu, Jul 12, 2018 at 02:09:45PM -0700, Andy Lutomirski wrote:
> > On Jul 11, 2018, at 4:29 AM, Joerg Roedel <joro@...tes.org> wrote:
> > -.macro SAVE_ALL pt_regs_ax=%eax
> > +.macro SAVE_ALL pt_regs_ax=%eax switch_stacks=0
> > cld
> > + /* Push segment registers and %eax */
> > PUSH_GS
> > pushl %fs
> > pushl %es
> > pushl %ds
> > pushl \pt_regs_ax
> > +
> > + /* Load kernel segments */
> > + movl $(__USER_DS), %eax
>
> If \pt_regs_ax != %eax, then this will behave oddly. Maybe it’s okay.
> But I don’t see why this change was needed at all.
This is a left-over from a previous approach I tried and then abandoned
later. You are right, it is not needed.
> > +/*
> > + * Called with pt_regs fully populated and kernel segments loaded,
> > + * so we can access PER_CPU and use the integer registers.
> > + *
> > + * We need to be very careful here with the %esp switch, because an NMI
> > + * can happen everywhere. If the NMI handler finds itself on the
> > + * entry-stack, it will overwrite the task-stack and everything we
> > + * copied there. So allocate the stack-frame on the task-stack and
> > + * switch to it before we do any copying.
>
> Ick, right. Same with machine check, though. You could alternatively
> fix it by running NMIs on an irq stack if the irq count is zero. How
> confident are you that you got #MC right?
Pretty confident, #MC uses the exception entry path which also handles
entry-stack and user-cr3 correctly. It might go through the slow
paranoid exit path, but that's okay for #MC I guess.
And when the #MC happens while we switch to the task stack and do the
copying, the same precautions as for NMI apply.
> > + */
> > +.macro SWITCH_TO_KERNEL_STACK
> > +
> > + ALTERNATIVE "", "jmp .Lend_\@", X86_FEATURE_XENPV
> > +
> > + /* Are we on the entry stack? Bail out if not! */
> > + movl PER_CPU_VAR(cpu_entry_area), %edi
> > + addl $CPU_ENTRY_AREA_entry_stack, %edi
> > + cmpl %esp, %edi
> > + jae .Lend_\@
>
> That’s an alarming assumption about the address space layout. How
> about an xor and an and instead of cmpl? As it stands, if the address
> layout ever changes, the failure may be rather subtle.
Right, I will implement a more restrictive check.
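
To illustrate the xor-and-and idea in C rather than assembly, here is a
minimal sketch. It is not the code from the patch: on_aligned_stack() and
its parameters are hypothetical names, and the sketch assumes the stack
size is a power of two and the stack base is aligned to that size. Whether
the real entry stack satisfies those assumptions is not stated in this
thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Assumption: stack_size is a power of two and stack_base is aligned
 * to stack_size. Under that assumption every address inside the stack
 * shares its upper bits with stack_base.
 */
static bool on_aligned_stack(uint32_t sp, uint32_t stack_base,
                             uint32_t stack_size)
{
	assert(stack_size != 0 && (stack_size & (stack_size - 1)) == 0);
	assert((stack_base & (stack_size - 1)) == 0);

	/*
	 * XOR leaves only the bits where sp and stack_base differ; the
	 * AND discards the offset bits inside the stack. A zero result
	 * means sp lies in [stack_base, stack_base + stack_size).
	 */
	return ((sp ^ stack_base) & ~(stack_size - 1)) == 0;
}
```

Unlike a single unsigned comparison against one end of the stack, this
test rejects addresses on both sides of the region, so it does not depend
on where the stack sits relative to the rest of the address space. Note
that a pointer exactly at the top of an empty stack
(stack_base + stack_size) is reported as outside.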
> Anyway, wouldn’t it be easier to solve this by just not switching
> stacks on entries from kernel mode and making the entry stack bigger?
> Stick an assertion in the scheduling code that we’re not on an entry
> stack, perhaps.
That'll save us the check whether we are on the entry stack and replace
it with a check whether we are coming from user/vm86 mode. I don't think
that this will simplify things much and I am a bit afraid that it'll
break unwritten assumptions elsewhere. It is probably something we can
look into later separately from the basic pti-x32 enablement.
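
For reference, a "coming from user/vm86 mode" test could look roughly like
the sketch below. It is a standalone illustration, not kernel code: the
function name from_user_or_vm86() is hypothetical, and the constants are
defined locally (the RPL occupies the low two bits of the CS selector, and
the VM flag is bit 17 of EFLAGS).

```c
#include <stdbool.h>
#include <stdint.h>

/* Local definitions for illustration only. */
#define SEGMENT_RPL_MASK 0x3u        /* low two bits of a selector */
#define USER_RPL         0x3u        /* ring 3 */
#define X86_EFLAGS_VM    0x00020000u /* EFLAGS bit 17: virtual-8086 mode */

/*
 * Hypothetical helper: true if the saved CS/EFLAGS pair describes an
 * entry from ring 3 or from vm86 mode. In vm86 mode the saved CS is a
 * real-mode style segment value whose low bits carry no RPL, so the VM
 * flag has to be checked as well.
 */
static bool from_user_or_vm86(uint32_t cs, uint32_t eflags)
{
	return ((cs & SEGMENT_RPL_MASK) | (eflags & X86_EFLAGS_VM)) >= USER_RPL;
}
```

The sketch only shows the shape of the condition; it says nothing about
which of the two checks fits better into the entry code.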
Thanks,
Joerg