Message-ID: <20080820003645.GC26611@yookeroo.seuss>
Date:	Wed, 20 Aug 2008 10:36:45 +1000
From:	David Gibson <david@...son.dropbear.id.au>
To:	William Lee Irwin <wli@...omorphy.com>,
	Andrew Morton <akpm@...l.org>
Cc:	libhugetlbfs-deve@...ts.sourceforge.net,
	linux-kernel@...r.kernel.org
Subject: Handle updating of ACCESSED and DIRTY in hugetlb_fault()

The page fault path for normal pages, if the fault is neither a
no-page fault nor a write-protect fault, will update the DIRTY and
ACCESSED bits in the page table appropriately.

The hugepage fault path, however, does not do this, handling only
no-page or write-protect type faults.  It assumes that either the
ACCESSED and DIRTY bits are irrelevant for hugepages (usually true,
since they are never swapped) or that they are handled by the arch
code.
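
As a rough illustration of the three cases above, here is a small
user-space sketch, not kernel code.  The MODEL_* flag values and the
model_classify_fault() helper are hypothetical names made up for this
example; the real tests are huge_pte_none() and pte_write() on the
arch's own pte_t.

```c
#include <stdbool.h>

/* Hypothetical flag bits; real values are architecture specific. */
#define MODEL_PTE_PRESENT 0x1u
#define MODEL_PTE_WRITE   0x2u

enum model_fault_kind {
	MODEL_FAULT_NO_PAGE,       /* nothing mapped at this address yet */
	MODEL_FAULT_WRITE_PROTECT, /* write to a mapping without write permission */
	MODEL_FAULT_FLAG_UPDATE    /* mapping is usable, only ACCESSED/DIRTY may be stale */
};

/* Classify a fault the way the changelog describes the three cases. */
static enum model_fault_kind model_classify_fault(unsigned int pte,
						  bool write_access)
{
	if (pte == 0)	/* stands in for huge_pte_none() */
		return MODEL_FAULT_NO_PAGE;
	if (write_access && !(pte & MODEL_PTE_WRITE))
		return MODEL_FAULT_WRITE_PROTECT;
	return MODEL_FAULT_FLAG_UPDATE;
}
```

The third case is the one the hugepage path currently leaves
unhandled.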

This is inconvenient for some software-loaded TLB architectures, where
the _PAGE_ACCESSED (_PAGE_DIRTY) bit needs to be set to enable read
(write) access to the page at the TLB miss.  This could be worked
around in the arch TLB miss code, but the TLB miss fast path is easier
to keep simple if hugetlb_fault() handles this, as the normal page
fault path does.
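
To make the software-loaded TLB case concrete, here is a user-space
sketch of the interaction, under the assumption that the miss handler
refuses to load an entry unless the relevant bits are already set.
The STLB_* values and both helpers are hypothetical and do not
correspond to any particular architecture's code.

```c
#include <stdbool.h>

/* Hypothetical flag bits; real values are architecture specific. */
#define STLB_PTE_PRESENT  0x1u
#define STLB_PTE_WRITE    0x2u
#define STLB_PTE_ACCESSED 0x4u
#define STLB_PTE_DIRTY    0x8u

/*
 * What a software TLB miss fast path might check: reads need ACCESSED,
 * writes additionally need write permission and DIRTY.  If this fails
 * the access falls through to the full page fault path.
 */
static bool stlb_miss_can_load(unsigned int pte, bool write_access)
{
	unsigned int need = STLB_PTE_PRESENT | STLB_PTE_ACCESSED;

	if (write_access)
		need |= STLB_PTE_WRITE | STLB_PTE_DIRTY;
	return (pte & need) == need;
}

/*
 * Slow path flag fixup, in the spirit of pte_mkdirty()/pte_mkyoung().
 * A write to a non-writable entry is a write-protect fault and is not
 * modelled here, so DIRTY is only set when the entry is writable.
 */
static unsigned int stlb_fault_fixup(unsigned int pte, bool write_access)
{
	if (write_access && (pte & STLB_PTE_WRITE))
		pte |= STLB_PTE_DIRTY;
	return pte | STLB_PTE_ACCESSED;
}
```

In this model a freshly instantiated entry with neither bit set fails
the fast path check, goes through the fixup once, and then loads on
the retry.  Without the fixup in the fault path it would fault again
indefinitely, unless the miss handler itself set the bits.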

Signed-off-by: David Gibson <david@...son.dropbear.id.au>

---

RFC, looking to merge for 2.6.28.

Index: working-2.6/mm/hugetlb.c
===================================================================
--- working-2.6.orig/mm/hugetlb.c	2008-08-19 15:14:51.000000000 +1000
+++ working-2.6/mm/hugetlb.c	2008-08-19 15:28:27.000000000 +1000
@@ -2008,7 +2008,7 @@ int hugetlb_fault(struct mm_struct *mm, 
 	entry = huge_ptep_get(ptep);
 	if (huge_pte_none(entry)) {
 		ret = hugetlb_no_page(mm, vma, address, ptep, write_access);
-		goto out_unlock;
+		goto out_mutex;
 	}
 
 	ret = 0;
@@ -2024,7 +2024,7 @@ int hugetlb_fault(struct mm_struct *mm, 
 	if (write_access && !pte_write(entry)) {
 		if (vma_needs_reservation(h, vma, address) < 0) {
 			ret = VM_FAULT_OOM;
-			goto out_unlock;
+			goto out_mutex;
 		}
 
 		if (!(vma->vm_flags & VM_SHARED))
@@ -2034,10 +2034,23 @@ int hugetlb_fault(struct mm_struct *mm, 
 
 	spin_lock(&mm->page_table_lock);
 	/* Check for a racing update before calling hugetlb_cow */
-	if (likely(pte_same(entry, huge_ptep_get(ptep))))
-		if (write_access && !pte_write(entry))
+	if (unlikely(!pte_same(entry, huge_ptep_get(ptep))))
+		goto out_page_table_lock;
+
+
+	if (write_access) {
+		if (!pte_write(entry)) {
 			ret = hugetlb_cow(mm, vma, address, ptep, entry,
 							pagecache_page);
+			goto out_page_table_lock;
+		}
+		entry = pte_mkdirty(entry);
+	}
+	entry = pte_mkyoung(entry);
+	if (huge_ptep_set_access_flags(vma, address, ptep, entry, write_access))
+		update_mmu_cache(vma, address, entry);
+
+out_page_table_lock:
 	spin_unlock(&mm->page_table_lock);
 
 	if (pagecache_page) {
@@ -2045,7 +2058,7 @@ int hugetlb_fault(struct mm_struct *mm, 
 		put_page(pagecache_page);
 	}
 
-out_unlock:
+out_mutex:
 	mutex_unlock(&hugetlb_instantiation_mutex);
 
 	return ret;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson