Message-ID: <9e8f46f6-35a7-e38d-0197-fb86b40fde1a@redhat.com>
Date: Wed, 10 Jan 2018 12:27:09 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>,
Tim Chen <tim.c.chen@...ux.intel.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Andy Lutomirski <luto@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Greg KH <gregkh@...uxfoundation.org>,
Dave Hansen <dave.hansen@...el.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Andi Kleen <ak@...ux.intel.com>,
Arjan Van De Ven <arjan.van.de.ven@...el.com>,
David Woodhouse <dwmw@...zon.co.uk>,
Dan Williams <dan.j.williams@...el.com>,
Ashok Raj <ashok.raj@...el.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 3/5] x86/enter: Use IBRS on syscall and interrupts

On 10/01/2018 11:04, Peter Zijlstra wrote:
> On Tue, Jan 09, 2018 at 06:26:47PM -0800, Tim Chen wrote:
>> Set IBRS upon kernel entrance via syscall and interrupts. Clear it
>> upon exit. IBRS protects against unsafe indirect branching predictions
>> in the kernel.
>>
>> The NMI interrupt save/restore of IBRS state was based on Andrea
>> Arcangeli's implementation.
>> Here's an explanation by Dave Hansen on why we save IBRS state for NMI.
>>
>> The normal interrupt code uses the 'error_entry' path which uses the
>> Code Segment (CS) of the instruction that was interrupted to tell
>> whether it interrupted the kernel or userspace and thus has to switch
>> IBRS, or leave it alone.
>>
>> The NMI code is different. It uses 'paranoid_entry' because it can
>> interrupt the kernel while it is running with a userspace IBRS (and %GS
>> and CR3) value, but has a kernel CS. If we used the same approach as
>> the normal interrupt code, we might do the following:
>>
>>     SYSENTER_entry
>> <-------------- NMI HERE
>>     IBRS=1
>>         do_something()
>>     IBRS=0
>>     SYSRET
>>
>> The NMI code might notice that we are running in the kernel and decide
>> that it is OK to skip the IBRS=1. This would leave it running
>> unprotected with IBRS=0, which is bad.
>>
>> However, if we unconditionally set IBRS=1 in the NMI, we might get the
>> following case:
>>
>>     SYSENTER_entry
>>     IBRS=1
>>         do_something()
>>     IBRS=0
>> <-------------- NMI HERE (set IBRS=1)
>>     SYSRET
>>
>> and we would return to userspace with IBRS=1. Userspace would run
>> slowly until we entered and exited the kernel again.
>>
>> Instead of those two approaches, we chose a third one where we simply
>> save the IBRS value in a scratch register (%r13) and then restore that
>> value, verbatim.
>>
>
> What this Changelog fails to address is _WHY_ we need this. What does
> this provide that retpoline does not.

Which, for the record, is just that it works better on Skylake+ CPUs.

Paolo
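
For reference, the three NMI strategies discussed in the quoted changelog can be sketched as a toy state model. This is not kernel code and not the patch's implementation: IBRS is reduced to a single integer, the names `run`, `SYSCALL_PATH`, and the strategy labels are hypothetical, and the model assumes the save/restore variant sets IBRS=1 between the save and the restore, which the changelog implies but does not spell out.

```python
# Toy model only: IBRS is one int (1 = set, 0 = clear), and the syscall
# path is the five-step sequence drawn in the quoted changelog.
SYSCALL_PATH = ["SYSENTER_entry", "IBRS=1", "do_something", "IBRS=0", "SYSRET"]


def run(strategy, nmi_before):
    """Walk the syscall path, delivering one NMI just before `nmi_before`.

    Returns (ibrs_while_nmi_runs, ibrs_on_return_to_userspace).
    """
    ibrs = 0  # we arrive from userspace, so IBRS is clear
    ibrs_in_nmi = None
    for step in SYSCALL_PATH:
        if step == nmi_before:
            saved = ibrs  # what the %r13 save would capture
            if strategy == "check_cs":
                # CS says "kernel", so the handler skips setting IBRS.
                pass
            elif strategy in ("always_set", "save_restore"):
                ibrs = 1
            else:
                raise ValueError(f"unknown strategy: {strategy}")
            ibrs_in_nmi = ibrs  # value in effect while the NMI body runs
            if strategy == "save_restore":
                ibrs = saved  # restore the interrupted value verbatim
        if step == "IBRS=1":
            ibrs = 1
        elif step == "IBRS=0":
            ibrs = 0
    return ibrs_in_nmi, ibrs


if __name__ == "__main__":
    for strategy in ("check_cs", "always_set", "save_restore"):
        for point in ("IBRS=1", "SYSRET"):
            print(strategy, "NMI before", point, "->", run(strategy, point))
```

In this model, "check_cs" runs the NMI with IBRS=0 when it lands before the IBRS=1 step, "always_set" returns to userspace with IBRS=1 when the NMI lands just before SYSRET, and "save_restore" runs the NMI with IBRS=1 and returns with IBRS=0 at both points.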