Message-ID: <20250602141610.173698-3-osalvador@suse.de>
Date: Mon,  2 Jun 2025 16:16:09 +0200
From: Oscar Salvador <osalvador@...e.de>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Muchun Song <muchun.song@...ux.dev>,
	David Hildenbrand <david@...hat.com>,
	James Houghton <jthoughton@...gle.com>,
	Peter Xu <peterx@...hat.com>,
	Gavin Guo <gavinguo@...lia.com>,
	linux-mm@...ck.org,
	linux-kernel@...r.kernel.org,
	Oscar Salvador <osalvador@...e.de>
Subject: [RFC PATCH 2/3] mm, hugetlb: Update comments in hugetlb_fault

There are some comments in the hugetlb faulting path that no longer hold
because the code has changed.
Most prominent is this one:

 /*
  * vmf.orig_pte could be a migration/hwpoison vmf.orig_pte at this
  * point, so this check prevents the kernel from going below assuming
  * that we have an active hugepage in pagecache. This goto expects
  * the 2nd page fault, and is_hugetlb_entry_(migration|hwpoisoned)
  * check will properly handle it.
  */

This was written because back in the day we used to do:

 hugetlb_fault
  ptep = huge_pte_offset(...)
  if (ptep) {
    entry = huge_ptep_get(ptep)
    if (unlikely(is_hugetlb_entry_migration(entry)))
        ...
    else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
        ...

  ...
  ...

 /*
  * entry could be a migration/hwpoison entry at this point, so this
  * check prevents the kernel from going below assuming that we have
  * a active hugepage in pagecache. This goto expects the 2nd page fault,
  * and is_hugetlb_entry_(migration|hwpoisoned) check will properly
  * handle it.
  */
 if (!pte_present(entry))
         goto out_mutex;
 ...

The code was designed to check for hwpoisoned/migration entries upfront,
and then bail out if, further down, the pte was no longer present,
relying on a second fault so that the checks at the beginning would
handle it for real this time.

This has changed, so the comment no longer holds; see the sketch of the
current flow below.
Also, mention in hugetlb_fault() that, besides allocation and instantiation,
the mutex also serializes against truncation.
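
For reference, the current flow looks roughly like this (a simplified
sketch, not the exact code in mm/hugetlb.c):

 hugetlb_fault
  ...
  vmf.orig_pte = huge_ptep_get(...)
  ...
  if (!pte_present(vmf.orig_pte)) {
    if (unlikely(is_hugetlb_entry_migration(vmf.orig_pte))) {
        /* wait for the migration entry to be resolved, then return */
        ...
    } else if (unlikely(is_hugetlb_entry_hwpoisoned(vmf.orig_pte)))
        ret = VM_FAULT_HWPOISON_LARGE | ...;
    goto out_mutex;
  }
  ...

i.e. non-present (migration/hwpoison) entries are handled right here,
rather than bailing out and relying on a second fault.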

Signed-off-by: Oscar Salvador <osalvador@...e.de>
---
 mm/hugetlb.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f7bef660ef94..6ef90958839f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6715,9 +6715,8 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	};
 
 	/*
-	 * Serialize hugepage allocation and instantiation, so that we don't
-	 * get spurious allocation failures if two CPUs race to instantiate
-	 * the same page in the page cache.
+	 * hugetlb_fault_mutex_hash serializes allocation, instantiation and
+	 * truncation.
 	 */
 	mapping = vma->vm_file->f_mapping;
 	hash = hugetlb_fault_mutex_hash(mapping, vmf.pgoff);
@@ -6765,11 +6764,8 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	ret = 0;
 
 	/*
-	 * vmf.orig_pte could be a migration/hwpoison vmf.orig_pte at this
-	 * point, so this check prevents the kernel from going below assuming
-	 * that we have an active hugepage in pagecache. This goto expects
-	 * the 2nd page fault, and is_hugetlb_entry_(migration|hwpoisoned)
-	 * check will properly handle it.
+	 * If it is not present, it means we are dealing either with a migration
+	 * or hwpoisoned entry.
 	 */
 	if (!pte_present(vmf.orig_pte)) {
 		if (unlikely(is_hugetlb_entry_migration(vmf.orig_pte))) {
@@ -6793,8 +6789,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * If we are going to COW/unshare the mapping later, we examine the
 	 * pending reservations for this page now. This will ensure that any
 	 * allocations necessary to record that reservation occur outside the
-	 * spinlock. Also lookup the pagecache page now as it is used to
-	 * determine if a reservation has been consumed.
+	 * spinlock.
 	 */
 	if ((flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) &&
 	    !(vma->vm_flags & VM_MAYSHARE) && !huge_pte_write(vmf.orig_pte)) {
-- 
2.49.0

