Message-ID: <alpine.LSU.2.11.1602292244320.7377@eggly.anvils>
Date:	Mon, 29 Feb 2016 22:45:59 -0800 (PST)
From:	Hugh Dickins <hughd@...gle.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	"Kirill A. Shutemov" <kirill@...temov.name>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Sasha Levin <sasha.levin@...cle.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: [PATCH] mm: __delete_from_page_cache show Bad page if mapped

Commit e1534ae95004 ("mm: differentiate page_mapped() from page_mapcount()
for compound pages") changed the famous BUG_ON(page_mapped(page)) in
__delete_from_page_cache() to VM_BUG_ON_PAGE(page_mapped(page)): which
gives us more info when CONFIG_DEBUG_VM=y, but nothing at all when not.

Although it has not usually been very helpful, being hit long after the
error in question, we do need to know if it actually happens on users'
systems; but reinstating a crash there is likely to be opposed :)

In the non-debug case, do pr_alert("BUG: Bad page cache") plus dump_page(),
dump_stack(), add_taint() - I don't really believe LOCKDEP_NOW_UNRELIABLE,
but that seems to be the standard procedure now.  Move that, or the
VM_BUG_ON_PAGE(), up before the deletion from tree: so that the
not-yet-NULLified page->mapping gives a little more information.

If the inode is being evicted (rather than truncated), it won't have
any vmas left, so it's safe(ish) to assume that the raised mapcount is
erroneous, and we can discount it from page_count to avoid leaking the
page (I'm less worried by leaking the occasional 4kB, than losing a
potential 2MB page with each 4kB page leaked).

Signed-off-by: Hugh Dickins <hughd@...gle.com>
---
I think this should go into v4.5, so I've written it with an atomic_sub
on page->_count; Joonsoo has noticed, and kindly agreed to page_ref'ify
it for mmotm after it's merged.
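
For anyone who wants to poke at the refcount discount outside the kernel,
here is a minimal userspace sketch of just that arithmetic. It is not kernel
code: struct fake_page and discount_stale_mapcount() are hypothetical names
invented for illustration, the mapcount is modelled as a plain count rather
than the kernel's biased _mapcount, and the reading of the "+ 2" as one
reference for the page cache plus one for the caller is an assumption, not
something stated in the changelog.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the two counters matter here. */
struct fake_page {
	int count;	/* models page_count(page) */
	int mapcount;	/* models page_mapcount(page), as a plain count */
};

/*
 * Model of the leak-avoidance step: if the mapping is exiting (so no vmas
 * should remain) and the refcount is large enough to cover the stale
 * mapcount plus two further references (assumed: page cache and caller),
 * drop the references attributed to the bogus mappings.
 * Returns true if the page was adjusted.
 */
static bool discount_stale_mapcount(struct fake_page *page, bool mapping_exiting)
{
	int mapcount;

	assert(page);
	mapcount = page->mapcount;

	if (mapcount <= 0)
		return false;		/* not mapped: nothing to report or fix */

	if (!mapping_exiting || page->count < mapcount + 2)
		return false;		/* cannot safely assume the mapcount is bogus */

	page->mapcount = 0;		/* models page_mapcount_reset() */
	page->count -= mapcount;	/* models atomic_sub(mapcount, &page->_count) */
	return true;
}
```

With count 3 and mapcount 1 on an exiting mapping, the sketch leaves count 2
and mapcount 0, so the usual final puts can free the page; on a mapping that
is only being truncated, or with too few references, it leaves the page
untouched (and, in the real patch, leaked but reported).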

 mm/filemap.c |   25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

--- 4.5-rc6/mm/filemap.c	2016-02-28 09:04:38.816707844 -0800
+++ linux/mm/filemap.c	2016-02-29 22:04:30.229738939 -0800
@@ -195,6 +195,30 @@ void __delete_from_page_cache(struct pag
 	else
 		cleancache_invalidate_page(mapping, page);

+	VM_BUG_ON_PAGE(page_mapped(page), page);
+	if (!IS_ENABLED(CONFIG_DEBUG_VM) && unlikely(page_mapped(page))) {
+		int mapcount;
+
+		pr_alert("BUG: Bad page cache in process %s  pfn:%05lx\n",
+			 current->comm, page_to_pfn(page));
+		dump_page(page, "still mapped when deleted");
+		dump_stack();
+		add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
+
+		mapcount = page_mapcount(page);
+		if (mapping_exiting(mapping) &&
+		    page_count(page) >= mapcount + 2) {
+			/*
+			 * All vmas have already been torn down, so it's
+			 * a good bet that actually the page is unmapped,
+			 * and we'd prefer not to leak it: if we're wrong,
+			 * some other bad page check should catch it later.
+			 */
+			page_mapcount_reset(page);
+			atomic_sub(mapcount, &page->_count);
+		}
+	}
+
 	page_cache_tree_delete(mapping, page, shadow);

 	page->mapping = NULL;
@@ -205,7 +229,6 @@ void __delete_from_page_cache(struct pag
 		__dec_zone_page_state(page, NR_FILE_PAGES);
 	if (PageSwapBacked(page))
 		__dec_zone_page_state(page, NR_SHMEM);
-	VM_BUG_ON_PAGE(page_mapped(page), page);

 	/*
 	 * At this point page must be either written or cleaned by truncate.