lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZajyqgE3ZHYHSvZC@elver.google.com>
Date: Thu, 18 Jan 2024 10:43:06 +0100
From: Marco Elver <elver@...gle.com>
To: Alexander Potapenko <glider@...gle.com>
Cc: quic_charante@...cinc.com, akpm@...ux-foundation.org,
	aneesh.kumar@...ux.ibm.com, dan.j.williams@...el.com,
	david@...hat.com, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	mgorman@...hsingularity.net, osalvador@...e.de, vbabka@...e.cz,
	"Paul E. McKenney" <paulmck@...nel.org>,
	Dmitry Vyukov <dvyukov@...gle.com>, kasan-dev@...glegroups.com,
	Ilya Leoshkevich <iii@...ux.ibm.com>,
	Nicholas Miehlbradt <nicholas@...ux.ibm.com>, rcu@...r.kernel.org
Subject: Re: [PATCH] mm/sparsemem: fix race in accessing memory_section->usage

On Thu, Jan 18, 2024 at 10:01AM +0100, Alexander Potapenko wrote:
> >
> > Hrm, rcu_read_unlock_sched_notrace() can still call
> > __preempt_schedule_notrace(), which is again instrumented by KMSAN.
> >
> > This patch gets me a working kernel:
> >
[...]
> > Disabling interrupts is a little heavy handed - it also assumes the
> > current RCU implementation. There is
> > preempt_enable_no_resched_notrace(), but that might be worse because it
> > breaks scheduling guarantees.
> >
> > That being said, whatever we do here should be wrapped in some
> > rcu_read_lock/unlock_<newvariant>() helper.
> 
> We could as well redefine rcu_read_lock/unlock in mm/kmsan/shadow.c
> (or the x86-specific KMSAN header, depending on whether people are
> seeing the problem on s390 and Power) with some header magic.
> But that's probably more fragile than adding a helper.
> 
> >
> > Is there an existing helper we can use? If not, we need a variant that
> > can be used from extremely constrained contexts that can't even call
> > into the scheduler. And if we want pfn_valid() to switch to it, it also
> > should be fast.

The below patch also gets me a working kernel. For pfn_valid(), using
rcu_read_lock_sched() should be reasonable, given its critical section
is very small and also enables it to be called from more constrained
contexts again (like KMSAN).

Within KMSAN we also have to suppress reschedules. This is again not
ideal, but since it's limited to KMSAN should be tolerable.

WDYT?

------ >8 ------

diff --git a/arch/x86/include/asm/kmsan.h b/arch/x86/include/asm/kmsan.h
index 8fa6ac0e2d76..bbb1ba102129 100644
--- a/arch/x86/include/asm/kmsan.h
+++ b/arch/x86/include/asm/kmsan.h
@@ -64,6 +64,7 @@ static inline bool kmsan_virt_addr_valid(void *addr)
 {
 	unsigned long x = (unsigned long)addr;
 	unsigned long y = x - __START_KERNEL_map;
+	bool ret;
 
 	/* use the carry flag to determine if x was < __START_KERNEL_map */
 	if (unlikely(x > y)) {
@@ -79,7 +80,21 @@ static inline bool kmsan_virt_addr_valid(void *addr)
 			return false;
 	}
 
-	return pfn_valid(x >> PAGE_SHIFT);
+	/*
+	 * pfn_valid() relies on RCU, and may call into the scheduler on exiting
+	 * the critical section. However, this would result in recursion with
+	 * KMSAN. Therefore, disable preemption here, and re-enable preemption
+	 * below while suppressing rescheduls to avoid recursion.
+	 *
+	 * Note, this sacrifices occasionally breaking scheduling guarantees.
+	 * Although, a kernel compiled with KMSAN has already given up on any
+	 * performance guarantees due to being heavily instrumented.
+	 */
+	preempt_disable();
+	ret = pfn_valid(x >> PAGE_SHIFT);
+	preempt_enable_no_resched();
+
+	return ret;
 }
 
 #endif /* !MODULE */
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 4ed33b127821..a497f189d988 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -2013,9 +2013,9 @@ static inline int pfn_valid(unsigned long pfn)
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
 	ms = __pfn_to_section(pfn);
-	rcu_read_lock();
+	rcu_read_lock_sched();
 	if (!valid_section(ms)) {
-		rcu_read_unlock();
+		rcu_read_unlock_sched();
 		return 0;
 	}
 	/*
@@ -2023,7 +2023,7 @@ static inline int pfn_valid(unsigned long pfn)
 	 * the entire section-sized span.
 	 */
 	ret = early_section(ms) || pfn_section_valid(ms, pfn);
-	rcu_read_unlock();
+	rcu_read_unlock_sched();
 
 	return ret;
 }

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ