lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 07 May 2008 22:52:37 +0200
From:	Henry Nestler <henry.nestler@...il.com>
To:	Ingo Molnar <mingo@...e.hu>
CC:	linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH] x86: endless page faults in mount_block_root for Linux
 2.6 - v2

Page faults in kernel address space between PAGE_OFFSET up to
VMALLOC_START should not try to access pte/pgd inside function
spurious_fault.

To fix, move vmalloc address range checks from vmalloc_fault to
do_page_fault for 32 and 64bit.

Signed-off-by: Henry Nestler <henry.nestler@...il.com>
---
32bit example, where adresss hole was faulting endless again (after the
patch from 2008-04-23):
=======
Linux version 2.6.25 (hn@...dt) (gcc version 4.2.1 (SUSE Linux)) #48
PREEMPT ...
64MB LOWMEM available.
[...]
entry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 61108k/65536k available (1482k kernel code, 0k reserved, 455k
data, 136k init, 0k highmem)
virtual kernel memory layout:
    fixmap  : 0xffffa000 - 0xfffff000   (  20 kB)
    vmalloc : 0xc4800000 - 0xffff8000   ( 951 MB)
    lowmem  : 0xc0000000 - 0xc4000000   (  64 MB)
      .init : 0xcCPA: page pool initialized 1 of 1 pages preallocated
[...]
checking if image is initramfs...it isn't (no cpio magic); looks like an
initrd
BUG: unable to handle kernel paging request at c0000e68
IP: [<c010cb84>] __change_page_attr_set_clr+0x104/0x590
*pde = 00000063 BUG: unable to handle kernel paging request at c0000000
IP: [<c010c5c9>] do_page_fault+0x639/0x730
*pde = 00000063 BUG: unable to handle kernel paging request at c0000000
IP: [<c010c5c9>] do_page_fault+0x639/0x730
*pde = 00000063 BUG: unable to handle kernel paging request at c0000000
IP: [<c010c5c9>] do_page_fault+0x639/0x730
===== ... this never ends or with a stack overflow ... ===

Shure, the "out of range address" was from buggy driver development.
But not of adresses should kill the complete system.

"__change_page_attr_set_clr" is some of the macros inside spurious_fault.

_After_ this patch, I got such normal trace back print:
========
checking if image is initramfs...it isn't (no cpio magic); looks like an
initrd
BUG: unable to handle kernel paging request at c0000e68
IP: [<c010cb74>] __change_page_attr_set_clr+0x104/0x590
*pde = 08a96063
Oops: 0000 [#1] PREEMPT
Modules linked in:

Pid: 1, comm: swapper Not tainted (2.6.25 #49)
EIP: 0060:[<c010cb74>] EFLAGS: 00010282 CPU: 0
EIP is at __change_page_attr_set_clr+0x104/0x590
EAX: c0000e68 EBX: 00000002 ECX: c030ac3c EDX: c3819edc
ESI: c4000000 EDI: c3f9a000 EBP: c3819eec ESP: c3819e80
 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process swapper (pid: 1, ti=c3818000 task=c38175f0 task.ti=c3818000)
<0>Stack: 00000046 00000000 00000000 c4000000 00000001 c3819efc ...
[...]
<0>Call Trace:
 [<c010d05f>] ? change_page_attr_set_clr+0x5f/0x1e0
 [<c010d1f7>] ? set_memory_rw+0x17/0x20
 [<c010b5b0>] ? free_init_pages+0x20/0xa0
 [<c015ed08>] ? fput+0x18/0x20
 [<c015bae7>] ? filp_close+0x47/0x70
 [<c010b641>] ? free_initrd_mem+0x11/0x20
 [<c02edf5c>] ? free_initrd+0x1c/0x40
 [<c02ee03b>] ? populate_rootfs+0xbb/0x100
 [<c02e8793>] ? kernel_init+0x83/0x260
[...]
========

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index fd7e179..59f612c 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -518,10 +518,6 @@ static int vmalloc_fault(unsigned long address)
 	pmd_t *pmd, *pmd_ref;
 	pte_t *pte, *pte_ref;

-	/* Make sure we are in vmalloc area */
-	if (!(address >= VMALLOC_START && address < VMALLOC_END))
-		return -1;
-
 	/* Copy kernel mappings over when needed. This can also
 	   happen within a race in page table update. In the later
 	   case just flush. */
@@ -620,13 +616,17 @@ void __kprobes do_page_fault(struct pt_regs *regs,
unsigned long error_code)
 #else
 	if (unlikely(address >= TASK_SIZE64)) {
 #endif
-		if (!(error_code & (PF_RSVD|PF_USER|PF_PROT)) &&
-		    vmalloc_fault(address) >= 0)
-			return;
+		/* Make sure we are in vmalloc area */
+		if (address >= VMALLOC_START && address < VMALLOC_END) {

-		/* Can handle a stale RO->RW TLB */
-		if (spurious_fault(address, error_code))
-			return;
+			if (!(error_code & (PF_RSVD|PF_USER|PF_PROT)) &&
+			    vmalloc_fault(address) >= 0)
+				return;
+
+			/* Can handle a stale RO->RW TLB */
+			if (spurious_fault(address, error_code))
+				return;
+		}

 		/*
 		 * Don't take the mm semaphore here. If we fixup a prefetch
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ