[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150824101755.GA2370@node.dhcp.inet.fi>
Date: Mon, 24 Aug 2015 13:17:56 +0300
From: "Kirill A. Shutemov" <kirill@...temov.name>
To: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Dave Hansen <dave.hansen@...el.com>,
Vlastimil Babka <vbabka@...e.cz>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.cz>,
David Rientjes <rientjes@...gle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCHv3 4/5] mm: make compound_head() robust
On Wed, Aug 19, 2015 at 12:21:45PM +0300, Kirill A. Shutemov wrote:
> Hugh has pointed that compound_head() call can be unsafe in some
> context. There's one example:
>
> CPU0 CPU1
>
> isolate_migratepages_block()
> page_count()
> compound_head()
> !!PageTail() == true
> put_page()
> tail->first_page = NULL
> head = tail->first_page
> alloc_pages(__GFP_COMP)
> prep_compound_page()
> tail->first_page = head
> __SetPageTail(p);
> !!PageTail() == true
> <head == NULL dereferencing>
>
> The race is pure theoretical. I don't it's possible to trigger it in
> practice. But who knows.
>
> We can fix the race by changing how encode PageTail() and compound_head()
> within struct page to be able to update them in one shot.
>
> The patch introduces page->compound_head into third double word block in
> front of compound_dtor and compound_order. That means it shares storage
> space with:
>
> - page->lru.next;
> - page->next;
> - page->rcu_head.next;
> - page->pmd_huge_pte;
>
> That's too long list to be absolutely sure, but looks like nobody uses
> bit 0 of the word. It can be used to encode PageTail(). And if the bit
> set, rest of the word is pointer to head page.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Acked-by: Michal Hocko <mhocko@...e.com>
> Cc: Hugh Dickins <hughd@...gle.com>
> Cc: David Rientjes <rientjes@...gle.com>
> Cc: Vlastimil Babka <vbabka@...e.cz>
If DEFERRED_STRUCT_PAGE_INIT=n, combining this patchset with my page-flags
patches causes oops in SetPageReserved() called from
reserve_bootmem_region().
It happens because we haven't yet initilized the word in struct page and
PageTail() inside SetPageReserved() can give false-positive, which leads
to bogus compound_head() result.
IIUC, we initialize the word only on first allocation of the page. It can
be too late: pfn scanner can see false-positive PageTail() from not yet
allocated pages too.
Here's fixlet for patch to address the issue.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 347724850665..d0e3fca830f8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -892,6 +892,8 @@ static void init_reserved_page(unsigned long pfn)
#else
static inline void init_reserved_page(unsigned long pfn)
{
+ /* Avoid false-positive PageTail() */
+ INIT_LIST_HEAD(&pfn_to_page(pfn)->lru);
}
#endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
--
Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists