lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4BCC7EF0.7060107@us.ibm.com>
Date:	Mon, 19 Apr 2010 09:04:00 -0700
From:	Darren Hart <dvhltc@...ibm.com>
To:	Mel Gorman <mel@....ul.ie>
CC:	Peter Zijlstra <peterz@...radead.org>, r6144 <rainy6144@...il.com>,
	linux-kernel@...r.kernel.org, tglx <tglx@...utronix.de>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Lee Schermerhorn <lee.schermerhorn@...com>
Subject: Re: Process-shared futexes on hugepages puts the kernel in an infinite
 loop in 2.6.32.11; is this fixed now?

Mel Gorman wrote:
> On Mon, Apr 19, 2010 at 01:52:36PM +0200, Peter Zijlstra wrote:
>> On Mon, 2010-04-19 at 12:43 +0100, Mel Gorman wrote:
>>
>>>> Right, so I had a quick chat with Mel, and it appears MAP_PRIVATE
>>>> hugetlb pages don't have their page->mapping set.
>>>>
>>>> I guess something like the below might work, but I'd really rather not
>>>> add hugetlb knowledge to futex.c. Does anybody else have a better idea?
>>>> Maybe create something similar to an anon_vma for hugetlb pages?
>>>>
>>> anon_vma for hugetlb pages sounds overkill, what would it gain? In this
>>> context, futex only appears to distinguish between whether the
>>> references are private or shared.
>>>
>>> Looking at the hugetlbfs code, I can't see a place where it actually cares
>>> about the mapping as such. It's used to find shared pages in the page cache
>>> (but not in the LRU) that are backed by the hugetlbfs file. For hugetlbfs
>>> though, the mapping is mostly kept in page->private for reservation accounting
>>> purposes.
>>>
>>> I can't think of other parts of the VM that touch the mapping if the
>>> page is managed by hugetlbfs so the following patch should also work but
>>> without futex having hugetlbfs-awareness. What do you think? Maybe for
>>> safety, it would be better to make the mapping some obvious poison bytes
>>> or'd with PAGE_MAPPING_ANON so an oops will be more obvious?
>> Yes, this seems perfectly adequate to me, that poison idea might be
>> worthwhile too :-)
>>
>> Acked-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
>>
> 
> Are you ok with an Ack to this slightly-difference patch too? If so,
> I'll forward it on to Andrew.

Thanks Mel, I'm also keen to keep hugetlb awareness out of futex.c, it's 
crazy enough in there already!

Acked-by: Darren Hart <darren@...art.com>

r6144, would you consider taking a look at the futextest test suite and 
letting me know where you think it might need improvement to catch what 
you ran into in your tests. I'm thinking some options to futex_wait and 
possibly other tests to tell it where to map the futex pages.

http://git.kernel.org/?p=linux/kernel/git/dvhart/futextest.git;a=summary

Thanks,

Darren Hart


> 
> ==== CUT HERE ====
> 
> Fix infinite loop in get_futex_key when backed by huge pages
> 
> If a futex key happens to be located within a huge page mapped MAP_PRIVATE,
> get_futex_key() can go into an infinite loop waiting for a page->mapping
> that will never exist. This was reported and documented in an external
> bugzilla at
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=552257
> 
> This patch makes page->mapping a poisoned value that includes PAGE_MAPPING_ANON
> mapped MAP_PRIVATE.  This is enough for futex to continue but because
> of PAGE_MAPPING_ANON, the poisoned value is not dereferenced or used by
> futex. No other part of the VM should be dereferencing the page->mapping of
> a hugetlbfs page as its page cache is not on the LRU.
> 
> This patch fixes the problem with the test case described in the bugzilla.
> 
> Signed-off-by: Mel Gorman <mel@....ul.ie>
> ---
>  include/linux/poison.h |   10 ++++++++++
>  mm/hugetlb.c           |    6 +++++-
>  2 files changed, 15 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/poison.h b/include/linux/poison.h
> index 2110a81..0f7b5ac 100644
> --- a/include/linux/poison.h
> +++ b/include/linux/poison.h
> @@ -48,6 +48,16 @@
>  #define POISON_FREE	0x6b	/* for use-after-free poisoning */
>  #define	POISON_END	0xa5	/* end-byte of poisoning */
> 
> +/********** mm/hugetlb.c **********/
> +/*
> + * Private mappings of hugetlb pages use this poisoned value for
> + * page->mapping. The core VM should not be doing anything with this mapping
> + * but futex requires the existance of some page->mapping value even if it
> + * is unused. If the core VM does deference the mapping, it'll look like a
> + * suspiciously high null-pointer offset starting from 0x2e5
> + */
> +#define HUGETLB_PRIVATE_MAPPING	(0x2e4 | PAGE_MAPPING_ANON)
> +
>  /********** arch/$ARCH/mm/init.c **********/
>  #define POISON_FREE_INITMEM	0xcc
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6034dc9..487e3c2 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -546,6 +546,7 @@ static void free_huge_page(struct page *page)
> 
>  	mapping = (struct address_space *) page_private(page);
>  	set_page_private(page, 0);
> +	page->mapping = NULL;
>  	BUG_ON(page_count(page));
>  	INIT_LIST_HEAD(&page->lru);
> 
> @@ -2447,8 +2448,11 @@ retry:
>  			spin_lock(&inode->i_lock);
>  			inode->i_blocks += blocks_per_huge_page(h);
>  			spin_unlock(&inode->i_lock);
> -		} else
> +		} else {
>  			lock_page(page);
> +			page->mapping = (struct address_space *)
> +							HUGETLB_PRIVATE_MAPPING;
> +		}
>  	}
> 
>  	/*


-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ