Message-ID: <alpine.LRH.2.00.1801051909160.27010@gjva.wvxbf.pm>
Date: Fri, 5 Jan 2018 19:19:51 +0100 (CET)
From: Jiri Kosina <jikos@...nel.org>
To: Dave Hansen <dave.hansen@...ux.intel.com>
cc: Yisheng Xie <xieyisheng1@...wei.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, richard.fellner@...dent.tugraz.at,
moritz.lipp@...k.tugraz.at, daniel.gruss@...k.tugraz.at,
michael.schwarz@...k.tugraz.at, luto@...nel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
keescook@...gle.com, hughd@...gle.com, x86@...nel.org,
Andrea Arcangeli <aarcange@...hat.com>,
Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH 05/23] x86, kaiser: unmap kernel from userspace page
tables (core patch)
[ adding Hugh ]
On Thu, 4 Jan 2018, Dave Hansen wrote:
> > BTW, we have just reported a bug caused by kaiser[1], which looks like
> > caused by SMEP. Could you please help to have a look?
> >
> > [1] https://lkml.org/lkml/2018/1/5/3
>
> Please report that to your kernel vendor. Your EFI page tables have the
> NX bit set on the low addresses. There have been a bunch of iterations
> of this, but you need to make sure that the EFI kernel mappings don't
> get _PAGE_NX set on them. Look at what __pti_set_user_pgd() does in
> mainline.

Unfortunately this is more complicated.

The thing is -- efi=old_memmap is broken even upstream. We will probably
not receive too many reports about this against upstream PTI, as most
machines use the classic high mapping of EFI regions; but older kernels
still force old_memmap on certain machines (or it can be specified
manually on the kernel cmdline), in which case EFI has all of its
mappings in the userspace range.

And that explodes, as those get marked NX in the kernel pagetables.

I've spent most of today tracking this down (the legacy EFI mmap is
horrid); the patch below is confirmed to fix it both on the current
upstream kernel and on original-KAISER based kernels (Hugh's backport)
in cases where old_memmap is used by EFI.

I am not super happy about this, but I didn't really want to extend the
_set_pgd() code to always figure out whether it's dealing with a low EFI
mapping or not, as that would be way too much overhead just for this
one-off call during boot.
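
The NX interaction described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not kernel code: every `mock_`/`MOCK_` name is made up for the sketch, and `mock_set_pgd()` is a heavily simplified stand-in for the PTI-aware PGD setter (the real one also checks, among other things, whether the slot maps the userspace half). Only the bit positions (present = bit 0, user = bit 2, NX = bit 63) are the real x86-64 ones.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 page-table flag bits; MOCK_ prefix marks them as illustration only. */
#define MOCK_PAGE_PRESENT (1ULL << 0)
#define MOCK_PAGE_USER    (1ULL << 2)
#define MOCK_PAGE_NX      (1ULL << 63)

typedef struct { uint64_t pgd; } mock_pgd_t;

/*
 * Simplified model of a PTI-aware PGD setter: any entry that is both
 * present and user-accessible gets NX forced on when it is installed.
 */
static void mock_set_pgd(mock_pgd_t *slot, mock_pgd_t val)
{
	const uint64_t user_present = MOCK_PAGE_USER | MOCK_PAGE_PRESENT;

	if ((val.pgd & user_present) == user_present)
		val.pgd |= MOCK_PAGE_NX;
	*slot = val;
}

/* Code can be fetched through an entry only if it is present and not NX. */
static int mock_can_execute(const mock_pgd_t *slot)
{
	return (slot->pgd & MOCK_PAGE_PRESENT) && !(slot->pgd & MOCK_PAGE_NX);
}

/* Walk through the prolog / fix / epilog sequence; returns 1 on success. */
static int mock_old_memmap_roundtrip(void)
{
	mock_pgd_t low_slot = { 0 };
	/* Hypothetical entries; the addresses are arbitrary. */
	mock_pgd_t user_entry = { (1ULL << 12) | MOCK_PAGE_PRESENT | MOCK_PAGE_USER };
	mock_pgd_t swapper_entry = { (2ULL << 12) | MOCK_PAGE_PRESENT | MOCK_PAGE_USER };
	mock_pgd_t saved;

	mock_set_pgd(&low_slot, user_entry);

	/* "prolog": save the old entry, install the swapper entry low. */
	saved = low_slot;
	mock_set_pgd(&low_slot, swapper_entry);
	assert(!mock_can_execute(&low_slot));	/* EFI code would fault here */

	/* The workaround: clear NX directly on the installed entry. */
	low_slot.pgd &= ~MOCK_PAGE_NX;
	assert(mock_can_execute(&low_slot));

	/* "epilog": restoring through the setter brings NX back. */
	mock_set_pgd(&low_slot, saved);
	assert(!mock_can_execute(&low_slot));
	return 1;
}
```

In this model the NX bit has to be cleared after the setter runs, because the setter itself always re-adds it for user-accessible entries; restoring the saved entry through the same setter then puts NX back without any extra step.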
From: Jiri Kosina <jkosina@...e.cz>
Subject: [PATCH] PTI: unbreak EFI old_memmap

old_memmap's efi_call_phys_prolog() calls set_pgd() with swapper PGD that
has _PAGE_USER set, which makes PTI set NX on it, and therefore EFI can't
execute its code.

Fix that by forcefully clearing _PAGE_NX from the PGD (this can't be done
by the pgprot API).

_PAGE_NX will be automatically reintroduced in efi_call_phys_epilog(), as
_set_pgd() will again notice that this is _PAGE_USER, and set _PAGE_NX on
it.

Signed-off-by: Jiri Kosina <jkosina@...e.cz>
---
arch/x86/platform/efi/efi_64.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -95,6 +95,12 @@ pgd_t * __init efi_call_phys_prolog(void
 		save_pgd[pgd] = *pgd_offset_k(pgd * PGDIR_SIZE);
 		vaddress = (unsigned long)__va(pgd * PGDIR_SIZE);
 		set_pgd(pgd_offset_k(pgd * PGDIR_SIZE), *pgd_offset_k(vaddress));
+		/*
+		 * pgprot API doesn't clear it for PGD
+		 *
+		 * Will be brought back automatically in _epilog()
+		 */
+		pgd_offset_k(pgd * PGDIR_SIZE)->pgd &= ~_PAGE_NX;
 	}
 
 	__flush_tlb_all();
--
Jiri Kosina
SUSE Labs