[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9FF4C41B-AB54-4FD6-9D0F-972CB7C39E42@vmware.com>
Date: Thu, 15 Feb 2018 17:36:12 +0000
From: Nadav Amit <namit@...are.com>
To: Dave Hansen <dave.hansen@...ux.intel.com>
CC: Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
"Andy Lutomirski" <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
"Willy Tarreau" <w@....eu>, "x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC v2 5/6] x86: Use global pages when PTI is disabled
Dave Hansen <dave.hansen@...ux.intel.com> wrote:
> On 02/15/2018 08:36 AM, Nadav Amit wrote:
>> As long as PTI is disabled, it is possible to use global pages, as long
>> as we remove them once PTI is enabled again. To do so, return the global
>> bit to __supported_pte_mask and disable global pages using CR4.
>>
>> Signed-off-by: Nadav Amit <namit@...are.com>
>> ---
>> arch/x86/include/asm/tlbflush.h | 6 ++++++
>> arch/x86/mm/init.c | 14 ++++++--------
>> arch/x86/mm/tlb.c | 3 ++-
>> 3 files changed, 14 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
>> index ea65cf951c49..3a44cb0a9f56 100644
>> --- a/arch/x86/include/asm/tlbflush.h
>> +++ b/arch/x86/include/asm/tlbflush.h
>> @@ -319,6 +319,12 @@ static inline void set_cpu_pti_disable(unsigned short disable)
>> WARN_ON_ONCE(preemptible());
>>
>> pti_update_user_cs64(cpu_pti_disable(), disable);
>> + if (__supported_pte_mask & _PAGE_GLOBAL) {
>> + if (disable)
>> + cr4_set_bits(X86_CR4_PGE);
>> + else
>> + cr4_clear_bits(X86_CR4_PGE);
>> + }
>> this_cpu_write(cpu_tlbstate.pti_disable, disable);
>> }
>
> The TLB invalidations when doing this switch are *CRITICAL*. Otherwise,
> we end up globally-mapped kernel entries persisting to other processes
> that are then vulnerable to Meltdown.
>
> So, where are the TLB flushes?
>
> They're hidden in the cr4_set/clear_bits() function, of course. This is
> dangerous for two reasons because it makes them non-obvious and hard to
> find. It also has no interactions with the existing TLB invalidation
> infrastructure. That's _safe_ of course because extra flushing is OK,
> but it feels really funky because you're going to end up double-flushing
> on context switches which is rather unfortunate.
>
> This also needs some heavy commenting about the fact that _PAGE_GLOBAL
> is ignored when CR4.PGE=0. That's key to this working and not mentioned
> anywhere.
>
> While this looks OK to me, it still makes me rather nervous. The
> changelog and commenting definitely need a lot of work. I'm also still
> rather unconvinced that the added complexity here is worth it.
I agree that comments, and perhaps some wrapper functions to clarify when a
flush takes place are necessary. I actually sent the patches before making
another pass since I saw they somewhat conflict with your recent patches.
I think that there are several reasons why these patches are not as bad as
they look:
1) Legacy mode performance (with PTI) is bad, but apparently x86-32
applications are still widely used: https://lkml.org/lkml/2018/2/10/38
2) It is likely that at some point you will want to disable PTI selectively
for processes, similarly to the way Windows “trusts” Microsoft SQL. At this
point you are likely to end up with most of the code that these patches
introduce.
Powered by blists - more mailing lists