Message-ID: <20100419114300.GT19264@csn.ul.ie>
Date: Mon, 19 Apr 2010 12:43:00 +0100
From: Mel Gorman <mel@....ul.ie>
To: Peter Zijlstra <peterz@...radead.org>
Cc: r6144 <rainy6144@...il.com>, linux-kernel@...r.kernel.org,
Darren Hart <dvhltc@...ibm.com>, tglx <tglx@...utronix.de>,
Andrea Arcangeli <aarcange@...hat.com>,
Lee Schermerhorn <lee.schermerhorn@...com>
Subject: Re: Process-shared futexes on hugepages puts the kernel in an
 infinite loop in 2.6.32.11; is this fixed now?

On Fri, Apr 16, 2010 at 10:27:48PM +0200, Peter Zijlstra wrote:
> On Fri, 2010-04-16 at 23:45 +0800, r6144 wrote:
> > Hello all,
> >
> > I'm having an annoying kernel bug regarding huge pages in Fedora 12:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=552257
> >
> > Basically I want to use huge pages in a multithreaded number crunching
> > program, which happens to use process-shared semaphores (because fftw
> > does it). The futex for the semaphore ends up lying on a huge page, and
> > I then get an endless loop in get_futex_key(), apparently because the
> > anonymous huge page containing the futex does not have a page->mapping.
> > A test case is provided in the above link.
> >
> > I reported the bug to Fedora bugzilla months ago, but haven't received
> > any feedback yet.
>
> No, it works much better if you simply mail LKML and CC people who work
> on the code in question ;-)
>
> > The Fedora kernel is based on 2.6.32.11, and a
> > cursory glance at the 2.6.34-rc3 source does not yield any relevant
> > change.
> >
> > So, could anyone tell me if the current mainline kernel might act better
> > in this respect, before I get around to compiling it?
>
> Right, so I had a quick chat with Mel, and it appears MAP_PRIVATE
> hugetlb pages don't have their page->mapping set.
>
> I guess something like the below might work, but I'd really rather not
> add hugetlb knowledge to futex.c. Does anybody else have a better idea?
> Maybe create something similar to an anon_vma for hugetlb pages?
>
anon_vma for hugetlb pages sounds like overkill; what would it gain? In this
context, futex only appears to care whether the references are private
or shared.

Looking at the hugetlbfs code, I can't see a place where it actually cares
about the mapping as such. It's used to find shared pages in the page cache
(but not in the LRU) that are backed by the hugetlbfs file. For hugetlbfs
though, the mapping is mostly kept in page->private for reservation accounting
purposes.

I can't think of other parts of the VM that touch the mapping if the
page is managed by hugetlbfs, so the following patch should also work, but
without futex having hugetlbfs-awareness. What do you think? Maybe for
safety, it would be better to make the mapping some obvious poison bytes
or'd with PAGE_MAPPING_ANON so an oops will be more obvious?

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6034dc9..57a5faa 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -546,6 +546,7 @@ static void free_huge_page(struct page *page)
 
 	mapping = (struct address_space *) page_private(page);
 	set_page_private(page, 0);
+	page->mapping = NULL;
 	BUG_ON(page_count(page));
 	INIT_LIST_HEAD(&page->lru);
 
@@ -2447,8 +2448,10 @@ retry:
 			spin_lock(&inode->i_lock);
 			inode->i_blocks += blocks_per_huge_page(h);
 			spin_unlock(&inode->i_lock);
-		} else
+		} else {
 			lock_page(page);
+			page->mapping = (struct address_space *)PAGE_MAPPING_ANON;
+		}
 	}
 
 	/*
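
To make the reasoning above concrete, here is a small userspace model of
the decision futex makes on page->mapping. It is only a sketch and not
the kernel's code: struct page is cut down to one field, and the names
classify_futex_page(), enum futex_key_kind, poisoned_anon_mapping() and
HYPOTHETICAL_POISON are made up for illustration. The only thing taken
from the kernel is the convention that bit 0 of page->mapping
(PAGE_MAPPING_ANON) marks an anonymous page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Kernel convention: bit 0 of page->mapping tags an anonymous page. */
#define PAGE_MAPPING_ANON 1UL

/* Hypothetical poison pattern (not a kernel constant); low bit is clear. */
#define HYPOTHETICAL_POISON 0x5a5a5a00UL

struct address_space;                /* opaque in this model */

struct page {                        /* cut-down stand-in for struct page */
	struct address_space *mapping;
};

/* Outcomes this model distinguishes (names are illustrative). */
enum futex_key_kind { KEY_RETRY, KEY_PRIVATE, KEY_SHARED };

/* Same idea as PageAnon(): test the low tag bit of the mapping pointer. */
static int page_is_anon(const struct page *page)
{
	return ((uintptr_t)page->mapping & PAGE_MAPPING_ANON) != 0;
}

/*
 * Model of the check: a NULL mapping means "try again", which never
 * terminates if the mapping is never set, as reported for MAP_PRIVATE
 * hugetlb pages. A tagged mapping yields a private key, anything else
 * a shared (inode-based) key.
 */
static enum futex_key_kind classify_futex_page(const struct page *page)
{
	if (page->mapping == NULL)
		return KEY_RETRY;
	if (page_is_anon(page))
		return KEY_PRIVATE;
	return KEY_SHARED;
}

/* The "poison bytes or'd with PAGE_MAPPING_ANON" variant suggested above. */
static struct address_space *poisoned_anon_mapping(void)
{
	/* The poison must leave bit 0 free for the anon tag. */
	assert((HYPOTHETICAL_POISON & PAGE_MAPPING_ANON) == 0);
	return (struct address_space *)
		(uintptr_t)(HYPOTHETICAL_POISON | PAGE_MAPPING_ANON);
}
```

In this model a page whose mapping is either the bare PAGE_MAPPING_ANON
tag or the poisoned value is classified as private, so the retry path is
never taken. The poisoned variant still satisfies the anon test, and any
code that mistakenly dereferences it would fault on a recognisable
address.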