Message-ID: <20250330064732.3781046-2-mcgrof@kernel.org>
Date: Sat, 29 Mar 2025 23:47:30 -0700
From: Luis Chamberlain <mcgrof@...nel.org>
To: brauner@...nel.org,
	jack@...e.cz,
	tytso@....edu,
	adilger.kernel@...ger.ca,
	linux-ext4@...r.kernel.org,
	riel@...riel.com
Cc: willy@...radead.org,
	hannes@...xchg.org,
	oliver.sang@...el.com,
	dave@...olabs.net,
	david@...hat.com,
	axboe@...nel.dk,
	hare@...e.de,
	david@...morbit.com,
	djwong@...nel.org,
	ritesh.list@...il.com,
	linux-fsdevel@...r.kernel.org,
	linux-block@...r.kernel.org,
	linux-mm@...ck.org,
	gost.dev@...sung.com,
	p.raghav@...sung.com,
	da.gomez@...sung.com,
	mcgrof@...nel.org
Subject: [PATCH 1/3] mm/migrate: add might_sleep() on __migrate_folio()

During page migration, folio_mc_copy() can cond_resched() *iff* the
folio is large. There's a hairy bug reported by both 0-day [0] and
syzbot [1] where we've been caught calling folio_mc_copy() in atomic
context. Technically speaking, that should only be possible today
from buffer-head filesystems using buffer_migrate_folio_norefs() on
page migration, and the only buffer-head filesystem with large folio
support is the block device cache, so this means block devices with
large block sizes. However, tracing shows that folio_mc_copy()
*isn't* being called as often as we'd expect from the
buffer_migrate_folio_norefs() path, as we're likely bailing out early
now thanks to the check added by commit 060913999d7a ("mm: migrate:
support poisoned recover from migrate folio").

*Most* folio_mc_copy() calls end up *not* being in atomic context,
and so we won't hit a splat when using:

CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_ATOMIC_SLEEP=y

But we *want* to proactively find callers of __migrate_folio() in
atomic context, so add an explicit might_sleep() to help root out
atomic callers of migrate_folio() on large folios.
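
As an illustration only (a hypothetical call site with placeholder
lock, mapping, dst and src, not one of the reported paths), something
like this would now trip the splat under CONFIG_DEBUG_ATOMIC_SLEEP
even if folio_mc_copy() bails out before its cond_resched():

	spin_lock(&lock);	/* atomic context */
	rc = migrate_folio(mapping, dst, src, MIGRATE_SYNC);
				/* __migrate_folio() -> might_sleep() warns */
	spin_unlock(&lock);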

Link: https://lkml.kernel.org/r/202503101536.27099c77-lkp@intel.com # [0]
Link: https://lkml.kernel.org/r/67e57c41.050a0220.2f068f.0033.GAE@google.com # [1]
Link: https://lkml.kernel.org/r/Z-c6BqCSmAnNxb57@bombadil.infradead.org # [2]
Signed-off-by: Luis Chamberlain <mcgrof@...nel.org>
---
 mm/migrate.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/migrate.c b/mm/migrate.c
index f3ee6d8d5e2e..712ddd11f3f0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -751,6 +751,8 @@ static int __migrate_folio(struct address_space *mapping, struct folio *dst,
 {
 	int rc, expected_count = folio_expected_refs(mapping, src);
 
+	might_sleep();
+
 	/* Check whether src does not have extra refs before we do more work */
 	if (folio_ref_count(src) != expected_count)
 		return -EAGAIN;
-- 
2.47.2

