Message-ID: <Pine.LNX.4.44.0402190642510.15577-200000@D00M.integrate.com.ru>
From: dan at D00M.integrate.com.ru (Dan Yefimov)
Subject: Re: Second critical mremap() bug found in all Linux kernels

On Wed, 18 Feb 2004, Paul Starzetz wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Synopsis:  Linux kernel do_mremap VMA limit local privilege escalation
>            vulnerability
> Product:   Linux kernel
> Version:   2.2 up to 2.2.25, 2.4 up to 2.4.24, 2.6 up to 2.6.2
> Vendor:    http://www.kernel.org/
> URL:       http://isec.pl/vulnerabilities/isec-0014-mremap-unmap.txt
> CVE:       CAN-2004-0077
> Author:    Paul Starzetz <ihaquer@...c.pl>
> Date:      February 18, 2004
> 
> 
> Issue:
> ======
> 
> A critical security vulnerability has been found in the Linux kernel 
> memory management code inside the mremap(2) system call, caused by a 
> missing check of a function's return value. This bug is completely 
> unrelated to the mremap bug disclosed on 05-01-2004, except that it 
> concerns the same internal kernel function.
> 
> 
> Details:
> ========
> 
> The Linux kernel manages a list of user-addressable valid memory 
> locations on a per-process basis. Every process owns a singly linked 
> list of so-called virtual memory area descriptors (from now on just 
> VMAs). Every VMA describes the start of a valid memory region, its 
> length, and various memory flags such as the page protection.
> 
> Every VMA in the list corresponds to a part of the process's page table. 
> The page table contains descriptors (in short: page table entries, PTEs) 
> of the physical memory pages seen by the process. A VMA descriptor can 
> thus be understood as a high-level description of a particular region of 
> the process's page table, storing PTE properties such as the page R/W 
> flag and so on.
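> 
> For orientation, a heavily simplified sketch of such a descriptor is 
> given below. It is loosely modeled on the 2.4 kernel's struct 
> vm_area_struct; most members are omitted and the layout is simplified:
> 
>     struct vm_area_struct {
>             struct mm_struct      *vm_mm;        /* owning address space             */
>             unsigned long          vm_start;     /* first address covered            */
>             unsigned long          vm_end;       /* first address past the region    */
>             struct vm_area_struct *vm_next;      /* next VMA in the per-process list */
>             pgprot_t               vm_page_prot; /* protection applied to the PTEs   */
>             unsigned long          vm_flags;     /* VM_READ, VM_WRITE, VM_EXEC, ...  */
>     };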
> 
> The mremap() system call provides resizing (shrinking or growing) as 
> well as moving of existing virtual memory areas, or any parts of them, 
> across the process's addressable space.
> 
> Moving a part of the virtual memory from inside a VMA to a new 
> location requires the creation of a new VMA descriptor as well as 
> copying the underlying page table entries described by the VMA from the 
> old to the new location in the process's page table.
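> 
> As an illustration only (this is not the exploit), the following 
> user-space fragment moves one page out of the middle of an anonymous 
> mapping to a fixed destination address inside a second mapping; with 
> MREMAP_FIXED the kernel first unmaps whatever was previously mapped at 
> the destination. The names src and dst are arbitrary:
> 
>     #define _GNU_SOURCE
>     #include <sys/mman.h>
>     #include <unistd.h>
> 
>     int main(void)
>     {
>             long psz = sysconf(_SC_PAGESIZE);
>             char *src = mmap(NULL, 16 * psz, PROT_READ | PROT_WRITE,
>                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>             char *dst = mmap(NULL, 16 * psz, PROT_READ | PROT_WRITE,
>                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>             /* move one page from the middle of src into the middle of
>                dst; MREMAP_FIXED requires MREMAP_MAYMOVE as well       */
>             char *moved = mremap(src + 4 * psz, psz, psz,
>                                  MREMAP_MAYMOVE | MREMAP_FIXED,
>                                  dst + 4 * psz);
>             return moved == MAP_FAILED;
>     }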
> 
> To accomplish this task the do_mremap code calls the do_munmap() 
> internal kernel function to remove any old memory mapping that may 
> exist at the new location, as well as to remove the old virtual memory 
> mapping. Unfortunately the code does not test the return value of the 
> do_munmap() function, which may fail if the maximum number of available 
> VMA descriptors has been exceeded. This happens if one tries to unmap 
> the middle part of an existing memory mapping while the process's limit 
> on the number of VMAs has already been reached (currently 65535).
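> 
> One plausible way for a process to approach that limit is to create 
> many single-page mappings with alternating protections, so that 
> adjacent VMAs cannot be merged (illustrative sketch only; the iteration 
> count of 70000 is arbitrary):
> 
>     #define _GNU_SOURCE
>     #include <sys/mman.h>
>     #include <unistd.h>
> 
>     int main(void)
>     {
>             long psz = sysconf(_SC_PAGESIZE);
>             /* every mapping gets its own VMA descriptor, since the
>                protections of successive mappings differ and therefore
>                the areas cannot be merged                              */
>             for (int i = 0; i < 70000; i++)
>                     if (mmap(NULL, psz, (i & 1) ? PROT_READ : PROT_NONE,
>                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0) == MAP_FAILED)
>                             break;
>             return 0;
>     }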
> 
> One of the possible situations can be illustrated with the following 
> picture. The corresponding page table entries (PTEs) have been marked 
> with o and x:
> 
> Before mremap():
> 
> (oooooooooooooooooooooooo)     (xxxxxxxxxxxx)
> [----------VMA1----------]     [----VMA2----]
>       [REMAPPED-VMA] <---------------|
> 
> 
> After mremap() without VMA limit:
> 
> (oooo)(xxxxxxxxxxxx)(oooo)
> [VMA3][REMAPPED-VMA][VMA4]
> 
> 
> After mremap() with the VMA limit reached:
> 
> (ooooxxxxxxxxxxxxxxoooo)
> [---------VMA1---------]
>      [REMAPPED-VMA]
> 
> 
> After the maximum number of VMAs in the process's VMA list has been 
> reached, do_munmap() will refuse to create the necessary VMA hole, 
> because doing so would split the original VMA into two disjoint VMAs 
> and thereby exceed the VMA descriptor limit.
> 
> Due to the missing return value check after trying to unmap the middle 
> of VMA1 (this is the first invocation of do_munmap inside the do_mremap 
> code), the corresponding page table entries from VMA2 are still inserted 
> into the page table location described by VMA1, and are thus subject to 
> VMA1's page protection flags. It must also be mentioned that the 
> original PTEs in VMA1 are lost, leaving the corresponding page frames 
> unusable forever.
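> 
> The problematic pattern can be paraphrased as follows (simplified 
> sketch only; argument lists vary between kernel branches and this is 
> not the verbatim kernel source):
> 
>     if (flags & MREMAP_FIXED) {
>             /* make room at the destination; the return value, which
>                signals e.g. that splitting a VMA would exceed the
>                per-process VMA limit, is silently discarded           */
>             do_munmap(new_addr, new_len);
>     }
>     /* ...the page table entries are later moved to new_addr anyway */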
> 
> The kernel also tries to insert the overlapping VMA into the VMA 
> descriptor list, but this fails due to further checks in the low-level 
> VMA manipulation code. The low-level VMA list check in the 2.4 and 2.6 
> kernel versions just calls BUG(), thereby terminating the malicious 
> process.
> 
> There are also two other unchecked calls to do_munmap() inside the 
> do_mremap() code, and we believe that the second occurrence of unchecked 
> do_munmap is also exploitable. The second occurrence takes place if the 
> VMA to be remapped is being truncated in place. Note that do_munmap can 
> also fail under an exceptional low-memory condition while trying to 
> allocate a VMA descriptor.
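> 
> Schematically, the remedy for this in-place truncation path is to 
> propagate a do_munmap() failure to the caller rather than silently 
> returning the old address (simplified fragment for illustration only):
> 
>     /* a shrinking remap only unmaps the tail of the old area; report
>        a do_munmap() failure instead of ignoring it                   */
>     if (old_len >= new_len) {
>             ret = do_munmap(addr + new_len, old_len - new_len);
>             if (!ret)
>                     ret = addr;
>             goto out;
>     }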
> 
> We were able to create a robust proof-of-concept exploit code giving 
> full super-user privileges on all vulnerable kernel versions. The 
> exploit code will be released next week.
> 
> 
> Impact:
> =======
> 
> Since no special privileges are required to use the mremap(2) system 
> call, any process may use its unexpected behavior to disrupt the kernel 
> memory management subsystem.
> 
> Proper exploitation of this vulnerability leads to local privilege 
> escalation giving an attacker full super-user privileges. The 
> vulnerability may also lead to a denial-of-service attack on the 
> available system memory.
> 
> Tested and known to be vulnerable kernel versions are all <= 2.2.25, 
> <= 2.4.24 and <= 2.6.1. The 2.2.25 version of the Linux kernel does not 
> recognize the MREMAP_FIXED flag, but this does not prevent the bug from 
> being successfully exploited. All users are encouraged to patch all 
> vulnerable systems as soon as appropriate vendor patches are released. 
> There is no hotfix for this vulnerability: limiting per-user virtual 
> memory still permits do_munmap() to fail.
> 
> 
> Credits:
> ========
> 
> Paul Starzetz <ihaquer@...c.pl> has identified the vulnerability and 
> performed further research. COPYING, DISTRIBUTION, AND MODIFICATION OF 
> INFORMATION PRESENTED HERE IS ALLOWED ONLY WITH EXPRESS PERMISSION OF 
> ONE OF THE AUTHORS.
> 
> 
> Disclaimer:
> ===========
> 
> This document and all the information it contains are provided "as is", 
> for educational purposes only, without warranty of any kind, whether 
> express or implied.
> 
> The authors reserve the right not to be responsible for the topicality, 
> correctness, completeness or quality of the information provided in 
> this document. Liability claims regarding damage caused by the use of 
> any information provided, including any kind of information which is 
> incomplete or incorrect, will therefore be rejected.
> 
> - -- 
> Paul Starzetz
> iSEC Security Research
> http://isec.pl/
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.0.7 (GNU/Linux)
> 
> iD8DBQFAM1QzC+8U3Z5wpu4RAqXzAKCMOkFu1mXzzRgLyuFYp4ORpQCQDgCfe4M2
> 3IjbGvzniOjv/Hc7KKAzMtU=
> =GJds
> -----END PGP SIGNATURE-----
> 
> 
The attached patch fixes this bug for kernel 2.2.25. It should also apply cleanly to 
kernels since at least 2.2.21. Rather than destructively moving the page table entries, 
the patch copies them, performs do_munmap() on the old range before inserting the new 
VMA, and backs the copy out if do_munmap() fails; the previously unchecked do_munmap() 
in the shrinking path is checked as well.
-- 

    Sincerely Yours, Dan.
-------------- next part --------------
--- linux/mm/mremap.c.security	Sun Mar 25 20:31:03 2001
+++ linux/mm/mremap.c	Thu Feb 19 05:10:34 2004
@@ -9,6 +9,7 @@
 #include <linux/shm.h>
 #include <linux/mman.h>
 #include <linux/swap.h>
+#include <linux/file.h>
 
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
@@ -25,7 +26,7 @@
 	if (pgd_none(*pgd))
 		goto end;
 	if (pgd_bad(*pgd)) {
-		printk("move_one_page: bad source pgd (%08lx)\n", pgd_val(*pgd));
+		printk("copy_one_page: bad source pgd (%08lx)\n", pgd_val(*pgd));
 		pgd_clear(pgd);
 		goto end;
 	}
@@ -34,7 +35,7 @@
 	if (pmd_none(*pmd))
 		goto end;
 	if (pmd_bad(*pmd)) {
-		printk("move_one_page: bad source pmd (%08lx)\n", pmd_val(*pmd));
+		printk("copy_one_page: bad source pmd (%08lx)\n", pmd_val(*pmd));
 		pmd_clear(pmd);
 		goto end;
 	}
@@ -57,34 +58,22 @@
 	return pte;
 }
 
-static inline int copy_one_pte(pte_t * src, pte_t * dst)
+static int copy_one_page(struct mm_struct *mm, unsigned long old_addr, unsigned long new_addr)
 {
-	int error = 0;
-	pte_t pte = *src;
+	pte_t * src, * dst;
 
-	if (!pte_none(pte)) {
-		error++;
-		if (dst) {
-			pte_clear(src);
-			set_pte(dst, pte);
-			error--;
+	src = get_one_pte(mm, old_addr);
+	if (src && !pte_none(*src)) {
+		if ((dst = alloc_one_pte(mm, new_addr))) {
+			set_pte(dst, *src);
+			return 0;
 		}
+		return 1;
 	}
-	return error;
-}
-
-static int move_one_page(struct mm_struct *mm, unsigned long old_addr, unsigned long new_addr)
-{
-	int error = 0;
-	pte_t * src;
-
-	src = get_one_pte(mm, old_addr);
-	if (src)
-		error = copy_one_pte(src, alloc_one_pte(mm, new_addr));
-	return error;
+	return 0;
 }
 
-static int move_page_tables(struct mm_struct * mm,
+static int copy_page_tables(struct mm_struct * mm,
 	unsigned long new_addr, unsigned long old_addr, unsigned long len)
 {
 	unsigned long offset = len;
@@ -99,7 +88,7 @@
 	 */
 	while (offset) {
 		offset -= PAGE_SIZE;
-		if (move_one_page(mm, old_addr + offset, new_addr + offset))
+		if (copy_one_page(mm, old_addr + offset, new_addr + offset))
 			goto oops_we_failed;
 	}
 	return 0;
@@ -113,8 +102,6 @@
 	 */
 oops_we_failed:
 	flush_cache_range(mm, new_addr, new_addr + len);
-	while ((offset += PAGE_SIZE) < len)
-		move_one_page(mm, new_addr + offset, old_addr + offset);
 	zap_page_range(mm, new_addr, len);
 	flush_tlb_range(mm, new_addr, new_addr + len);
 	return -1;
@@ -129,7 +116,9 @@
 	if (new_vma) {
 		unsigned long new_addr = get_unmapped_area(addr, new_len);
 
-		if (new_addr && !move_page_tables(current->mm, new_addr, addr, old_len)) {
+		if (new_addr && !copy_page_tables(current->mm, new_addr, addr, old_len)) {
+			unsigned long ret;
+
 			*new_vma = *vma;
 			new_vma->vm_start = new_addr;
 			new_vma->vm_end = new_addr+new_len;
@@ -138,9 +127,19 @@
 				new_vma->vm_file->f_count++;
 			if (new_vma->vm_ops && new_vma->vm_ops->open)
 				new_vma->vm_ops->open(new_vma);
+			if ((ret = do_munmap(addr, old_len))) {
+				if (new_vma->vm_ops && new_vma->vm_ops->close)
+					new_vma->vm_ops->close(new_vma);
+				if (new_vma->vm_file)
+					fput(new_vma->vm_file);
+				flush_cache_range(current->mm, new_addr, new_addr + old_len);
+				zap_page_range(current->mm, new_addr, old_len);
+				flush_tlb_range(current->mm, new_addr, new_addr + old_len);
+				kmem_cache_free(vm_area_cachep, new_vma);
+				return ret;
+			}
 			insert_vm_struct(current->mm, new_vma);
 			merge_segments(current->mm, new_vma->vm_start, new_vma->vm_end);
-			do_munmap(addr, old_len);
 			current->mm->total_vm += new_len >> PAGE_SHIFT;
 			if (new_vma->vm_flags & VM_LOCKED) {
 				current->mm->locked_vm += new_len >> PAGE_SHIFT;
@@ -176,9 +175,9 @@
 	 * Always allow a shrinking remap: that just unmaps
 	 * the unnecessary pages..
 	 */
-	ret = addr;
 	if (old_len >= new_len) {
-		do_munmap(addr+new_len, old_len - new_len);
+		if (!(ret = do_munmap(addr+new_len, old_len - new_len)))
+			ret = addr;
 		goto out;
 	}
 
