lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 19 Apr 2010 16:32:45 +0100
From:	Mel Gorman <mel@....ul.ie>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	r6144 <rainy6144@...il.com>, linux-kernel@...r.kernel.org,
	Darren Hart <dvhltc@...ibm.com>, tglx <tglx@...utronix.de>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Lee Schermerhorn <lee.schermerhorn@...com>
Subject: Re: Process-shared futexes on hugepages puts the kernel in an
	infinite loop in 2.6.32.11; is this fixed now?

On Mon, Apr 19, 2010 at 01:52:36PM +0200, Peter Zijlstra wrote:
> On Mon, 2010-04-19 at 12:43 +0100, Mel Gorman wrote:
> 
> > > Right, so I had a quick chat with Mel, and it appears MAP_PRIVATE
> > > hugetlb pages don't have their page->mapping set.
> > > 
> > > I guess something like the below might work, but I'd really rather not
> > > add hugetlb knowledge to futex.c. Does anybody else have a better idea?
> > > Maybe create something similar to an anon_vma for hugetlb pages?
> > > 
> > 
> > anon_vma for hugetlb pages sounds overkill, what would it gain? In this
> > context, futex only appears to distinguish between whether the
> > references are private or shared.
> > 
> > Looking at the hugetlbfs code, I can't see a place where it actually cares
> > about the mapping as such. It's used to find shared pages in the page cache
> > (but not in the LRU) that are backed by the hugetlbfs file. For hugetlbfs
> > though, the mapping is mostly kept in page->private for reservation accounting
> > purposes.
> > 
> > I can't think of other parts of the VM that touch the mapping if the
> > page is managed by hugetlbfs so the following patch should also work but
> > without futex having hugetlbfs-awareness. What do you think? Maybe for
> > safety, it would be better to make the mapping some obvious poison bytes
> > or'd with PAGE_MAPPING_ANON so an oops will be more obvious?
> 
> Yes, this seems perfectly adequate to me, that poison idea might be
> worthwhile too :-)
> 
> Acked-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> 

Are you ok with an Ack to this slightly-difference patch too? If so,
I'll forward it on to Andrew.

==== CUT HERE ====

Fix infinite loop in get_futex_key when backed by huge pages

If a futex key happens to be located within a huge page mapped MAP_PRIVATE,
get_futex_key() can go into an infinite loop waiting for a page->mapping
that will never exist. This was reported and documented in an external
bugzilla at

https://bugzilla.redhat.com/show_bug.cgi?id=552257

This patch makes page->mapping a poisoned value that includes PAGE_MAPPING_ANON
mapped MAP_PRIVATE.  This is enough for futex to continue but because
of PAGE_MAPPING_ANON, the poisoned value is not dereferenced or used by
futex. No other part of the VM should be dereferencing the page->mapping of
a hugetlbfs page as its page cache is not on the LRU.

This patch fixes the problem with the test case described in the bugzilla.

Signed-off-by: Mel Gorman <mel@....ul.ie>
---
 include/linux/poison.h |   10 ++++++++++
 mm/hugetlb.c           |    6 +++++-
 2 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/include/linux/poison.h b/include/linux/poison.h
index 2110a81..0f7b5ac 100644
--- a/include/linux/poison.h
+++ b/include/linux/poison.h
@@ -48,6 +48,16 @@
 #define POISON_FREE	0x6b	/* for use-after-free poisoning */
 #define	POISON_END	0xa5	/* end-byte of poisoning */
 
+/********** mm/hugetlb.c **********/
+/*
+ * Private mappings of hugetlb pages use this poisoned value for
+ * page->mapping. The core VM should not be doing anything with this mapping
+ * but futex requires the existance of some page->mapping value even if it
+ * is unused. If the core VM does deference the mapping, it'll look like a
+ * suspiciously high null-pointer offset starting from 0x2e5
+ */
+#define HUGETLB_PRIVATE_MAPPING	(0x2e4 | PAGE_MAPPING_ANON)
+
 /********** arch/$ARCH/mm/init.c **********/
 #define POISON_FREE_INITMEM	0xcc
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6034dc9..487e3c2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -546,6 +546,7 @@ static void free_huge_page(struct page *page)
 
 	mapping = (struct address_space *) page_private(page);
 	set_page_private(page, 0);
+	page->mapping = NULL;
 	BUG_ON(page_count(page));
 	INIT_LIST_HEAD(&page->lru);
 
@@ -2447,8 +2448,11 @@ retry:
 			spin_lock(&inode->i_lock);
 			inode->i_blocks += blocks_per_huge_page(h);
 			spin_unlock(&inode->i_lock);
-		} else
+		} else {
 			lock_page(page);
+			page->mapping = (struct address_space *)
+							HUGETLB_PRIVATE_MAPPING;
+		}
 	}
 
 	/*
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ