Message-ID: <55785B5E.3000306@suse.cz>
Date: Wed, 10 Jun 2015 17:44:30 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Hugh Dickins <hughd@...gle.com>
CC: Dave Hansen <dave.hansen@...el.com>, Mel Gorman <mgorman@...e.de>,
Rik van Riel <riel@...hat.com>,
Christoph Lameter <cl@...two.org>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Steve Capper <steve.capper@...aro.org>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.cz>,
Jerome Marchand <jmarchan@...hat.com>,
Sasha Levin <sasha.levin@...cle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCHv6 32/36] thp: reintroduce split_huge_page()
On 06/03/2015 07:06 PM, Kirill A. Shutemov wrote:
> This patch adds an implementation of split_huge_page() for the new
> refcounting.
>
> Unlike the previous implementation, the new split_huge_page() can fail
> if somebody holds a GUP pin on the page. It also means that a pin on a
> page will prevent it from being split under you. This makes the
> situation in many places much cleaner.
>
> The basic scheme of split_huge_page():
>
> - Check that the sum of mapcounts of all subpages is equal to
> page_count() plus one (the caller's pin). Fail with -EBUSY otherwise.
> This way we can avoid useless PMD splits.
>
> - Freeze the page counters by splitting all PMDs and setting up
> migration PTEs.
>
> - Re-check the sum of mapcounts against page_count(). The page's
> counts are stable now. Return -EBUSY if the page is pinned.
>
> - Split compound page.
>
> - Unfreeze the page by removing migration entries.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Tested-by: Sasha Levin <sasha.levin@...cle.com>
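
Restating the scheme above as code, just to make sure I read it right
(a rough sketch in my own words, not the patch code; the helper names
are illustrative only):

	static int split_huge_page_sketch(struct page *head)
	{
		/* Cheap check: any pin beyond the mappings and the caller's? */
		if (total_mapcount(head) + 1 != page_count(head))
			return -EBUSY;

		/* Freeze: split all PMDs, install migration PTEs */
		freeze_page(head);

		/* Counts are stable now; re-check for a racing GUP pin */
		if (total_mapcount(head) + 1 != page_count(head)) {
			unfreeze_page(head);
			return -EBUSY;
		}

		__split_compound(head);		/* split the compound page */
		unfreeze_page(head);		/* remove migration entries */
		return 0;
	}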
[...]
> +
> +static int __split_huge_page_tail(struct page *head, int tail,
> + struct lruvec *lruvec, struct list_head *list)
> +{
> + int mapcount;
> + struct page *page_tail = head + tail;
> +
> + mapcount = page_mapcount(page_tail);
Isn't page_mapcount() unnecessarily heavyweight here? When you are
splitting a page, it should already have a zero compound_mapcount()
and shouldn't be PageDoubleMap(), no? So you should only care about
page->_mapcount? Sure, splitting a THP is not a hotpath, but when done
512 times per split, it could make some difference in the split's
latency.
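Something like this is what I mean (an untested sketch; _mapcount is
stored offset by -1, hence the +1):

	/*
	 * Under split, compound_mapcount() is zero and the page is not
	 * PageDoubleMap(), so only the tail's own _mapcount matters.
	 */
	mapcount = atomic_read(&page_tail->_mapcount) + 1;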
> + VM_BUG_ON_PAGE(atomic_read(&page_tail->_count) != 0, page_tail);
> +
> + /*
> + * tail_page->_count is zero and not changing from under us. But
> + * get_page_unless_zero() may be running from under us on the
> + * tail_page. If we used atomic_set() below instead of atomic_add(), we
> + * would then run atomic_set() concurrently with
> + * get_page_unless_zero(), and atomic_set() is implemented in C not
> + * using locked ops. spin_unlock on x86 sometimes uses locked ops
> + * because of PPro errata 66, 92, so unless somebody can guarantee
> + * atomic_set() here would be safe on all archs (and not only on x86),
> + * it's safer to use atomic_add().
I would be surprised if this were the first place to use atomic_set()
with a potentially concurrent atomic_add(). Shouldn't the atomic_*()
API guarantee that this works?
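For clarity, the two variants under discussion (sketch):

	/* what the patch does: a locked RMW, safe against a concurrent
	 * atomic_inc_not_zero() from get_page_unless_zero() */
	atomic_add(page_mapcount(page_tail) + 1, &page_tail->_count);

	/* what the comment rejects: a plain store, relying only on
	 * _count being 0 and get_page_unless_zero() failing on 0 */
	atomic_set(&page_tail->_count, page_mapcount(page_tail) + 1);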
> + */
> + atomic_add(page_mapcount(page_tail) + 1, &page_tail->_count);
You already have the value in the mapcount variable, so why read it
again?
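I.e.:

	atomic_add(mapcount + 1, &page_tail->_count);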