Message-ID: <20171221162631.y57r4f5nbeu35llz@node.shutemov.name>
Date: Thu, 21 Dec 2017 19:26:31 +0300
From: "Kirill A. Shutemov" <kirill@...temov.name>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Dave Hansen <dave.hansen@...el.com>,
Borislav Petkov <bpetkov@...e.de>,
Greg KH <gregkh@...uxfoundation.org>, keescook@...gle.com,
hughd@...gle.com, Brian Gerst <brgerst@...il.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
Rik van Riel <riel@...hat.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Juergen Gross <jgross@...e.com>,
David Laight <David.Laight@...lab.com>,
Eduardo Valentin <eduval@...zon.com>, aliguori@...zon.com,
Will Deacon <will.deacon@....com>,
Vlastimil Babka <vbabka@...e.cz>, daniel.gruss@...k.tugraz.at
Subject: Re: [patch V181 00/54] x86/pti: Final XMAS release
On Thu, Dec 21, 2017 at 03:57:02PM +0300, Kirill A. Shutemov wrote:
> On Wed, Dec 20, 2017 at 10:35:03PM +0100, Thomas Gleixner wrote:
> > The series is also available from git:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/pti
>
> The patchset looks sane in 5-level paging configuration as long as commit
> c739f930be1d ("x86/espfix/64: Fix espfix double-fault handling on 5-level
> systems") from tip/x86/urgent is applied.
>
> Tested-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
Correction: with the series applied, the kernel fails to boot with EFI.
I don't think the failure is limited to 5-level paging. The oops and a
fix are below.
BUG: unable to handle kernel paging request at ff1000017d803000
IP: __pti_set_user_pgd+0x22/0x44
PGD 4be4067 P4D 4be5067 PUD 27f905067 PMD 27f718067 PTE 800000017d803060
Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-00174-g9b1270951308 #6613
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
task: ffffffff822114c0 task.stack: ffffffff82200000
RIP: 0010:__pti_set_user_pgd+0x22/0x44
RSP: 0000:ffffffff82203c70 EFLAGS: 00000202
RAX: 000000007c490063 RBX: ffffffff82203e28 RCX: 0000000000000002
RDX: 0000000000000001 RSI: 000000007c490063 RDI: ff1000017d803000
RBP: 000000007ff58000 R08: 0000000000000000 R09: 0000000000000067
R10: ffffffff82203a78 R11: 0000000000000001 R12: ff1000017d802000
R13: 0000000000000000 R14: 000000007ff58000 R15: ffffffff82203e28
FS: 0000000000000000(0000) GS:ff1000007f000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ff1000017d803000 CR3: 000000000440a000 CR4: 00000000000016b0
Call Trace:
__cpa_process_fault+0x3f6/0x5d0
? get_page_from_freelist+0x34f/0xbd0
__change_page_attr_set_clr+0x820/0xd70
? __alloc_pages_nodemask+0x124/0xfd0
? __alloc_pages_nodemask+0x124/0xfd0
? printk+0x3e/0x46
? kernel_map_pages_in_pgd+0x91/0x180
kernel_map_pages_in_pgd+0x91/0x180
? __map_region+0x37/0x53
__map_region+0x37/0x53
efi_map_region+0x27/0xb3
efi_enter_virtual_mode+0x26f/0x490
start_kernel+0x368/0x3df
secondary_startup_64+0xab/0xb0
Code: 90 90 90 90 90 90 90 90 90 48 89 fa 48 89 f0 81 e2 ff 0f 00 00 48 81 fa ff 07 00 00 77 16 48 89 f2 48 81 cf 00 10 00 00 83 e2 05 <48> 89 37 48 83 fa 05 74 01 c3 48 83 3d 6c 7c 29 01 00 79 f5 48
RIP: __pti_set_user_pgd+0x22/0x44 RSP: ffffffff82203c70
CR2: ff1000017d803000
-----------8<-----------
From 373d3c99f9a8a70f19363c6f007e78216a5f935e Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Date: Thu, 21 Dec 2017 19:11:54 +0300
Subject: [PATCH] x86/efi: allocate two PGDs for EFI page tables if PTI is enabled
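
EFI has its own top-level page table to avoid inserting EFI region
mappings into the standard kernel page tables.

With PTI enabled, the normal page table helpers also write the user-side
copy of a PGD entry, which they expect in the second 4k half of an
8k-aligned, order-1 PGD allocation. The EFI PGD is a single page, so that
write faults.

Allocate two PGD pages for EFI when PTI is enabled. The user side is never
used, but this allows us to use the normal helpers to deal with EFI page
tables.
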
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
---
arch/x86/include/asm/pgalloc.h | 11 +++++++++++
arch/x86/mm/pgtable.c | 11 -----------
arch/x86/platform/efi/efi_64.c | 5 ++++-
3 files changed, 15 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index 4b5e1eafada7..aff42e1da6ee 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -30,6 +30,17 @@ static inline void paravirt_release_p4d(unsigned long pfn) {}
*/
extern gfp_t __userpte_alloc_gfp;
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+/*
+ * Instead of one PGD, we acquire two PGDs. Being order-1, it is
+ * both 8k in size and 8k-aligned. That lets us just flip bit 12
+ * in a pointer to swap between the two 4k halves.
+ */
+#define PGD_ALLOCATION_ORDER 1
+#else
+#define PGD_ALLOCATION_ORDER 0
+#endif
+
/*
* Allocate and free page tables.
*/
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index c05b6dccc72d..9b7bcbd33cc2 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -356,17 +356,6 @@ static inline void _pgd_free(pgd_t *pgd)
}
#else
-#ifdef CONFIG_PAGE_TABLE_ISOLATION
-/*
- * Instead of one PGD, we acquire two PGDs. Being order-1, it is
- * both 8k in size and 8k-aligned. That lets us just flip bit 12
- * in a pointer to swap between the two 4k halves.
- */
-#define PGD_ALLOCATION_ORDER 1
-#else
-#define PGD_ALLOCATION_ORDER 0
-#endif
-
static inline pgd_t *_pgd_alloc(void)
{
return (pgd_t *)__get_free_pages(PGALLOC_GFP, PGD_ALLOCATION_ORDER);
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 20fb31579b69..39c4b35ac7a4 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -195,6 +195,9 @@ static pgd_t *efi_pgd;
* because we want to avoid inserting EFI region mappings (EFI_VA_END
* to EFI_VA_START) into the standard kernel page tables. Everything
* else can be shared, see efi_sync_low_kernel_mappings().
+ *
+ * We don't want the pgd on the pgd_list and cannot use pgd_alloc() for the
+ * allocation.
*/
int __init efi_alloc_page_tables(void)
{
@@ -207,7 +210,7 @@ int __init efi_alloc_page_tables(void)
return 0;
gfp_mask = GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO;
- efi_pgd = (pgd_t *)__get_free_page(gfp_mask);
+ efi_pgd = (pgd_t *)__get_free_pages(gfp_mask, PGD_ALLOCATION_ORDER);
if (!efi_pgd)
return -ENOMEM;
--
Kirill A. Shutemov