lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250924181138.1762750-1-charan.kalla@oss.qualcomm.com>
Date: Wed, 24 Sep 2025 23:41:38 +0530
From: Charan Teja Kalla <charan.kalla@....qualcomm.com>
To: david@...hat.com, Liam.Howlett@...cle.com, lorenzo.stoakes@...cle.com,
        akpm@...ux-foundation.org, shikemeng@...weicloud.com,
        kasong@...cent.com, nphamcs@...il.com, bhe@...hat.com,
        baohua@...nel.org, chrisl@...nel.org, zhangpeng.00@...edance.com
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Charan Teja Kalla <charan.kalla@....qualcomm.com>
Subject: [PATCH V2] mm: swap: check for stable address space before operating on the VMA

It is possible to hit a zero entry while traversing the vmas in
unuse_mm() called from swapoff path and accessing it causes the
OOPS:

Unable to handle kernel NULL pointer dereference at virtual address
0000000000000446--> Loading the memory from offset 0x40 on the
XA_ZERO_ENTRY as address.
Mem abort info:
  ESR = 0x0000000096000005
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x05: level 1 translation fault

The issue is manifested from the below race between the fork() on a
process and swapoff:
fork(dup_mmap())			swapoff(unuse_mm)
---------------                         -----------------
1) Identical mtree is built using
   __mt_dup().

2) copy_pte_range()-->
	copy_nonpresent_pte():
       The dst mm is added into the
    mmlist to be visible to the
    swapoff operation.

3) Fatal signal is sent to the parent
process(which is the current during the
fork) thus skip the duplication of the
vmas and mark the vma range with
XA_ZERO_ENTRY as a marker for this process
that helps during exit_mmap().

				     4) swapoff is tried on the
					'mm' added to the 'mmlist' as
					part of the 2.

				     5) unuse_mm(), that iterates
					through the vma's of this 'mm'
					will hit the non-NULL zero entry
					and operating on this zero entry
					as a vma is resulting into the
					oops.

The proper fix would be around not exposing this partially-valid tree to
others when droping the mmap lock, which is being solved with [1]. A
simpler solution would be checking for MMF_UNSTABLE, as it is set if
mm_struct is not fully initialized in dup_mmap().

Thanks to Liam/Lorenzo/David for all the suggestions in fixing this
issue.

[1] https://lore.kernel.org/all/20250815191031.3769540-1-Liam.Howlett@oracle.com/

Fixes: d24062914837 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()")
Suggested-by: David Hildenbrand <david@...hat.com>
Signed-off-by: Charan Teja Kalla <charan.kalla@....qualcomm.com>
---

V1:
   -- Checking for xa_zero_entry() instead of most cleaner way of
checking for MMF_UNSTABLE
   -- https://lore.kernel.org/linux-mm/20250808092156.1918973-1-quic_charante@quicinc.com/

 mm/swapfile.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 890b410d77b6..10760240a3a2 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2389,6 +2389,8 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
 	VMA_ITERATOR(vmi, mm, 0);
 
 	mmap_read_lock(mm);
+	if (check_stable_address_space(mm))
+		goto unlock;
 	for_each_vma(vmi, vma) {
 		if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
 			ret = unuse_vma(vma, type);
@@ -2398,6 +2400,7 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
 
 		cond_resched();
 	}
+unlock:
 	mmap_read_unlock(mm);
 	return ret;
 }
-- 
2.34.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ