lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241129125303.4033164-1-david@redhat.com>
Date: Fri, 29 Nov 2024 13:53:03 +0100
From: David Hildenbrand <david@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org,
	David Hildenbrand <david@...hat.com>,
	syzbot+9f9a7f73fb079b2387a6@...kaller.appspotmail.com,
	"Matthew Wilcox (Oracle)" <willy@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Hillf Danton <hdanton@...a.com>
Subject: [PATCH v1] mm/filemap: don't call folio_test_locked() without a reference in next_uptodate_folio()

The folio can get freed + buddy-merged + reallocated in the meantime,
resulting in us calling folio_test_locked() possibly on a tail page.

This makes const_folio_flags VM_BUG_ON_PGFLAGS() when stumbling over
the tail page.

Could this result in other issues? Doesn't look like it. False positives
and false negatives don't really matter, because this folio would get
skipped either way when detecting that they have been reallocated in
the meantime.

Fix it by performing the folio_test_locked() checked after grabbing a
reference. If this ever becomes a real problem, we could add a special
helper that racily checks if the bit is set even on tail pages ... but
let's hope that's not required so we can just handle it cleaner:
work on the folio after we hold a reference.

Do we really need the folio_test_locked() check if we are going to
trylock briefly after? Well, we can at least avoid a xas_reload().

It's a bit unclear which exact change introduced that issue. Likely,
ever since we made PG_locked obey to the PF_NO_TAIL policy it could have
been triggered in some way.

Reported-by: syzbot+9f9a7f73fb079b2387a6@...kaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/674184c9.050a0220.1cc393.0001.GAE@google.com/
Fixes: 48c935ad88f5 ("page-flags: define PG_locked behavior on compound pages")
Cc: "Matthew Wilcox (Oracle)" <willy@...radead.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: Hillf Danton <hdanton@...a.com>
Signed-off-by: David Hildenbrand <david@...hat.com>
---
 mm/filemap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 7c76a123ba18b..f61cf51c22389 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3501,10 +3501,10 @@ static struct folio *next_uptodate_folio(struct xa_state *xas,
 			continue;
 		if (xa_is_value(folio))
 			continue;
-		if (folio_test_locked(folio))
-			continue;
 		if (!folio_try_get(folio))
 			continue;
+		if (folio_test_locked(folio))
+			goto skip;
 		/* Has the page moved or been split? */
 		if (unlikely(folio != xas_reload(xas)))
 			goto skip;
-- 
2.47.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ