linux-kernel - [tip:x86/urgent] x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Fri, 5 May 2017 01:11:16 -0700
From:   tip-bot for Baoquan He <tipbot@...or.com>
To:     linux-tip-commits@...r.kernel.org
Cc:     peterz@...radead.org, tglx@...utronix.de,
        linux-kernel@...r.kernel.org, dan.j.williams@...el.com,
        dyoung@...hat.com, yasu.isimatu@...il.com, mingo@...nel.org,
        hpa@...or.com, jinb.park7@...il.com, jpoimboe@...hat.com,
        thgarnie@...gle.com, dvlasenk@...hat.com, bhe@...hat.com,
        kirill.shutemov@...ux.intel.com, dave.hansen@...ux.intel.com,
        akpm@...ux-foundation.org, luto@...nel.org, yinghai@...nel.org,
        jmoyer@...hat.com, brgerst@...il.com, keescook@...omium.org,
        torvalds@...ux-foundation.org, bp@...en8.de
Subject: [tip:x86/urgent] x86/mm: Fix boot crash caused by incorrect loop
 count calculation in sync_global_pgds()

Commit-ID:  fc5f9d5f151c9fff21d3d1d2907b888a5aec3ff7
Gitweb:     http://git.kernel.org/tip/fc5f9d5f151c9fff21d3d1d2907b888a5aec3ff7
Author:     Baoquan He <bhe@...hat.com>
AuthorDate: Thu, 4 May 2017 10:25:47 +0800
Committer:  Ingo Molnar <mingo@...nel.org>
CommitDate: Fri, 5 May 2017 08:21:24 +0200

x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds()

Jeff Moyer reported that on his system with two memory regions 0~64G and
1T~1T+192G, and kernel option "memmap=192G!1024G" added, enabling KASLR
will make the system hang intermittently during boot. While adding 'nokaslr'
won't.

The back trace is:

 Oops: 0000 [#1] SMP

 RIP: memcpy_erms()
 [ .... ]
 Call Trace:
  pmem_rw_page()
  bdev_read_page()
  do_mpage_readpage()
  mpage_readpages()
  blkdev_readpages()
  __do_page_cache_readahead()
  force_page_cache_readahead()
  page_cache_sync_readahead()
  generic_file_read_iter()
  blkdev_read_iter()
  __vfs_read()
  vfs_read()
  SyS_read()
  entry_SYSCALL_64_fastpath()

This crash happens because the for loop count calculation in sync_global_pgds()
is not correct. When a mapping area crosses PGD entries, we should
calculate the starting address of region which next PGD covers and assign
it to next for loop count, but not add PGDIR_SIZE directly. The old
code works right only if the mapping area is an exact multiple of PGDIR_SIZE,
otherwize the end region could be skipped so that it can't be synchronized
to all other processes from kernel PGD init_mm.pgd.

In Jeff's system, emulated pmem area [1024G, 1216G) is smaller than
PGDIR_SIZE. While 'nokaslr' works because PAGE_OFFSET is 1T aligned, it
makes this area be mapped inside one PGD entry. With KASLR enabled,
this area could cross two PGD entries, then the next PGD entry won't
be synced to all other processes. That is why we saw empty PGD.

Fix it.

Reported-by: Jeff Moyer <jmoyer@...hat.com>
Signed-off-by: Baoquan He <bhe@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: Andy Lutomirski <luto@...nel.org>
Cc: Borislav Petkov <bp@...en8.de>
Cc: Brian Gerst <brgerst@...il.com>
Cc: Dan Williams <dan.j.williams@...el.com>
Cc: Dave Hansen <dave.hansen@...ux.intel.com>
Cc: Dave Young <dyoung@...hat.com>
Cc: Denys Vlasenko <dvlasenk@...hat.com>
Cc: H. Peter Anvin <hpa@...or.com>
Cc: Jinbum Park <jinb.park7@...il.com>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>
Cc: Kees Cook <keescook@...omium.org>
Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Garnier <thgarnie@...gle.com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Yasuaki Ishimatsu <yasu.isimatu@...il.com>
Cc: Yinghai Lu <yinghai@...nel.org>
Link: http://lkml.kernel.org/r/1493864747-8506-1-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@...nel.org>
---
 arch/x86/mm/init_64.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 745e5e1..97fe887 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -94,10 +94,10 @@ __setup("noexec32=", nonx32_setup);
  */
 void sync_global_pgds(unsigned long start, unsigned long end)
 {
-	unsigned long address;
+	unsigned long addr;
 
-	for (address = start; address <= end; address += PGDIR_SIZE) {
-		pgd_t *pgd_ref = pgd_offset_k(address);
+	for (addr = start; addr <= end; addr = ALIGN(addr + 1, PGDIR_SIZE)) {
+		pgd_t *pgd_ref = pgd_offset_k(addr);
 		const p4d_t *p4d_ref;
 		struct page *page;
 
@@ -106,7 +106,7 @@ void sync_global_pgds(unsigned long start, unsigned long end)
 		 * handle synchonization on p4d level.
 		 */
 		BUILD_BUG_ON(pgd_none(*pgd_ref));
-		p4d_ref = p4d_offset(pgd_ref, address);
+		p4d_ref = p4d_offset(pgd_ref, addr);
 
 		if (p4d_none(*p4d_ref))
 			continue;
@@ -117,8 +117,8 @@ void sync_global_pgds(unsigned long start, unsigned long end)
 			p4d_t *p4d;
 			spinlock_t *pgt_lock;
 
-			pgd = (pgd_t *)page_address(page) + pgd_index(address);
-			p4d = p4d_offset(pgd, address);
+			pgd = (pgd_t *)page_address(page) + pgd_index(addr);
+			p4d = p4d_offset(pgd, addr);
 			/* the pgt_lock only for Xen */
 			pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
 			spin_lock(pgt_lock);