Date:   Thu, 21 Dec 2017 19:26:31 +0300
From:   "Kirill A. Shutemov" <kirill@...temov.name>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andy Lutomirsky <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Borislav Petkov <bpetkov@...e.de>,
        Greg KH <gregkh@...uxfoundation.org>, keescook@...gle.com,
        hughd@...gle.com, Brian Gerst <brgerst@...il.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        Rik van Riel <riel@...hat.com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Juergen Gross <jgross@...e.com>,
        David Laight <David.Laight@...lab.com>,
        Eduardo Valentin <eduval@...zon.com>, aliguori@...zon.com,
        Will Deacon <will.deacon@....com>,
        Vlastimil Babka <vbabka@...e.cz>, daniel.gruss@...k.tugraz.at
Subject: Re: [patch V181 00/54] x86/pti: Final XMAS release

On Thu, Dec 21, 2017 at 03:57:02PM +0300, Kirill A. Shutemov wrote:
> On Wed, Dec 20, 2017 at 10:35:03PM +0100, Thomas Gleixner wrote:
> > The series is also available from git:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/pti
> 
> The patchset looks sane in 5-level paging configuration as long as commit
> c739f930be1d ("x86/espfix/64: Fix espfix double-fault handling on 5-level
> systems") from tip/x86/urgent is applied.
> 
> Tested-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>

Correction: the kernel fails to boot with EFI. I don't think the failure is
limited to 5-level paging.

The oops is below, followed by the fix.

	BUG: unable to handle kernel paging request at ff1000017d803000
	IP: __pti_set_user_pgd+0x22/0x44
	PGD 4be4067 P4D 4be5067 PUD 27f905067 PMD 27f718067 PTE 800000017d803060
	Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
	CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-00174-g9b1270951308 #6613
	Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
	task: ffffffff822114c0 task.stack: ffffffff82200000
	RIP: 0010:__pti_set_user_pgd+0x22/0x44
	RSP: 0000:ffffffff82203c70 EFLAGS: 00000202
	RAX: 000000007c490063 RBX: ffffffff82203e28 RCX: 0000000000000002
	RDX: 0000000000000001 RSI: 000000007c490063 RDI: ff1000017d803000
	RBP: 000000007ff58000 R08: 0000000000000000 R09: 0000000000000067
	R10: ffffffff82203a78 R11: 0000000000000001 R12: ff1000017d802000
	R13: 0000000000000000 R14: 000000007ff58000 R15: ffffffff82203e28
	FS:  0000000000000000(0000) GS:ff1000007f000000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: ff1000017d803000 CR3: 000000000440a000 CR4: 00000000000016b0
	Call Trace:
	 __cpa_process_fault+0x3f6/0x5d0
	 ? get_page_from_freelist+0x34f/0xbd0
	 __change_page_attr_set_clr+0x820/0xd70
	 ? __alloc_pages_nodemask+0x124/0xfd0
	 ? __alloc_pages_nodemask+0x124/0xfd0
	 ? printk+0x3e/0x46
	 ? kernel_map_pages_in_pgd+0x91/0x180
	 kernel_map_pages_in_pgd+0x91/0x180
	 ? __map_region+0x37/0x53
	 __map_region+0x37/0x53
	 efi_map_region+0x27/0xb3
	 efi_enter_virtual_mode+0x26f/0x490
	 start_kernel+0x368/0x3df
	 secondary_startup_64+0xab/0xb0
	Code: 90 90 90 90 90 90 90 90 90 48 89 fa 48 89 f0 81 e2 ff 0f 00 00 48 81 fa ff 07 00 00 77 16 48 89 f2 48 81 cf 00 10 00 00 83 e2 05 <48> 89 37 48 83 fa 05 74 01 c3 48 83 3d 6c 7c 29 01 00 79 f5 48
	RIP: __pti_set_user_pgd+0x22/0x44 RSP: ffffffff82203c70
	CR2: ff1000017d803000
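
The faulting address can be related to the register dump with a little arithmetic. The following is a minimal sketch, assuming that R12 (ff1000017d802000) holds the base of the single-page EFI PGD and that the faulting store targets the user half, which the PTI helpers locate by setting bit 12 of the kernel PGD pointer. The names user_pgd_addr(), in_allocation() and check_oops_values() are hypothetical and exist only for illustration; they are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/*
 * Address of the user-side PGD slot for a kernel-side PGD slot.
 * Assumption: the two halves share one 8k-aligned order-1 allocation,
 * so the user half is reached by setting bit 12.
 */
static inline uint64_t user_pgd_addr(uint64_t kernel_pgd_addr)
{
	return kernel_pgd_addr | PAGE_SIZE;
}

/* True if addr lies inside an allocation of 2^order pages starting at base. */
static inline bool in_allocation(uint64_t base, unsigned int order, uint64_t addr)
{
	return addr >= base && addr - base < (PAGE_SIZE << order);
}

/* Plug in the values from the oops above. */
static inline void check_oops_values(void)
{
	const uint64_t base = 0xff1000017d802000ULL; /* R12 (assumed PGD base) */
	const uint64_t cr2  = 0xff1000017d803000ULL; /* CR2, the faulting address */

	/* Setting bit 12 of the assumed base gives exactly the faulting address. */
	assert(user_pgd_addr(base) == cr2);
	/* An order-0 (single page) allocation does not cover that address... */
	assert(!in_allocation(base, 0, cr2));
	/* ...while an order-1 (two page) allocation does. */
	assert(in_allocation(base, 1, cr2));
}
```

Under these assumptions, an order-0 allocation leaves the computed user-half address one page past the end of the PGD, which is consistent with the write fault on a not-present page reported with DEBUG_PAGEALLOC enabled. An order-1 allocation keeps both halves inside the same allocation.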

-----------8<-----------

From 373d3c99f9a8a70f19363c6f007e78216a5f935e Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Date: Thu, 21 Dec 2017 19:11:54 +0300
Subject: [PATCH] x86/efi: allocate two PGD for EFI page tables if PTI is
 enabled

EFI has its own top-level page table to avoid inserting EFI region
mappings into standard kernel page tables.

For PTI, we need to allocate two PGD page tables instead of one.
The user side is never used, but this allows us to use normal helpers to
deal with EFI page tables.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
---
 arch/x86/include/asm/pgalloc.h | 11 +++++++++++
 arch/x86/mm/pgtable.c          | 11 -----------
 arch/x86/platform/efi/efi_64.c |  5 ++++-
 3 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index 4b5e1eafada7..aff42e1da6ee 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -30,6 +30,17 @@ static inline void paravirt_release_p4d(unsigned long pfn) {}
  */
 extern gfp_t __userpte_alloc_gfp;
 
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+/*
+ * Instead of one PGD, we acquire two PGDs.  Being order-1, it is
+ * both 8k in size and 8k-aligned.  That lets us just flip bit 12
+ * in a pointer to swap between the two 4k halves.
+ */
+#define PGD_ALLOCATION_ORDER 1
+#else
+#define PGD_ALLOCATION_ORDER 0
+#endif
+
 /*
  * Allocate and free page tables.
  */
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index c05b6dccc72d..9b7bcbd33cc2 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -356,17 +356,6 @@ static inline void _pgd_free(pgd_t *pgd)
 }
 #else
 
-#ifdef CONFIG_PAGE_TABLE_ISOLATION
-/*
- * Instead of one PGD, we acquire two PGDs.  Being order-1, it is
- * both 8k in size and 8k-aligned.  That lets us just flip bit 12
- * in a pointer to swap between the two 4k halves.
- */
-#define PGD_ALLOCATION_ORDER 1
-#else
-#define PGD_ALLOCATION_ORDER 0
-#endif
-
 static inline pgd_t *_pgd_alloc(void)
 {
 	return (pgd_t *)__get_free_pages(PGALLOC_GFP, PGD_ALLOCATION_ORDER);
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 20fb31579b69..39c4b35ac7a4 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -195,6 +195,9 @@ static pgd_t *efi_pgd;
  * because we want to avoid inserting EFI region mappings (EFI_VA_END
  * to EFI_VA_START) into the standard kernel page tables. Everything
  * else can be shared, see efi_sync_low_kernel_mappings().
+ *
+ * We don't want the pgd on the pgd_list and cannot use pgd_alloc() for the
+ * allocation.
  */
 int __init efi_alloc_page_tables(void)
 {
@@ -207,7 +210,7 @@ int __init efi_alloc_page_tables(void)
 		return 0;
 
 	gfp_mask = GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO;
-	efi_pgd = (pgd_t *)__get_free_page(gfp_mask);
+	efi_pgd = (pgd_t *)__get_free_pages(gfp_mask, PGD_ALLOCATION_ORDER);
 	if (!efi_pgd)
 		return -ENOMEM;
 
-- 
 Kirill A. Shutemov
