Message-ID: <7cf16cfb-3190-dfbd-ce72-92a94d9277f5@linux.alibaba.com>
Date: Wed, 30 Jan 2019 09:47:19 -0800
From: Yang Shi <yang.shi@...ux.alibaba.com>
To: John Hubbard <jhubbard@...dia.com>, ktkhai@...tuozzo.com,
hughd@...gle.com, aarcange@...hat.com, akpm@...ux-foundation.org
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [v3 PATCH] mm: ksm: do not block on page lock when searching stable tree
On 1/29/19 11:14 PM, John Hubbard wrote:
> On 1/29/19 12:29 PM, Yang Shi wrote:
>> ksmd needs to search the stable tree to look for a suitable KSM page, but
>> the KSM page might be locked for a while due to, e.g., a KSM page rmap
>> walk. Basically this is not a big deal since commit 2c653d0ee2ae
>> ("ksm: introduce ksm_max_page_sharing per page deduplication limit"),
>> because max_page_sharing limits the number of shared KSM pages.
>>
>> But it still seems not worth waiting for the lock; the page can be
>> skipped and merged in the next scan if its content is still intact,
>> avoiding a potential stall.
>>
>> Introduce a trylock mode to get_ksm_page() so it does not block on the
>> page lock, as try_to_merge_one_page() does. And define the three possible
>> operations (nolock, lock and trylock) as an enum type to avoid stacking
>> up bools and to make the code more readable.
>>
>> Return -EBUSY if trylock fails, since NULL already means no suitable KSM
>> page was found, which is a valid case.
>>
>> With the default max_page_sharing setting (256), there is almost no
>> observed difference between the lock and trylock versions.
>>
>> However, with LTP's ksm02, which sets max_page_sharing to 786432, a
>> reduced ksmd full scan time can be observed. With the lock version, ksmd
>> may take 10s - 11s to run two full scans; with the trylock version, ksmd
>> may take 8s - 11s. And the numbers of pages_sharing and pages_to_scan
>> stay the same. Basically, this change does no harm.
>>
>> Cc: Hugh Dickins <hughd@...gle.com>
>> Cc: Andrea Arcangeli <aarcange@...hat.com>
>> Suggested-by: John Hubbard <jhubbard@...dia.com>
>> Reviewed-by: Kirill Tkhai <ktkhai@...tuozzo.com>
>> Signed-off-by: Yang Shi <yang.shi@...ux.alibaba.com>
>> ---
>> Hi folks,
>>
>> This patch was submitted together with "mm: vmscan: skip KSM page in
>> direct reclaim if priority is low" in the initial submission. Then Hugh
>> and Andrea pointed out that commit 2c653d0ee2ae ("ksm: introduce
>> ksm_max_page_sharing per page deduplication limit") is good enough to
>> limit the number of shared KSM pages and prevent a soft lockup when
>> walking KSM page rmap, and that commit does solve the problem. So the
>> series was dropped by Andrew from the -mm tree.
>>
>> However, I think the second patch (this one) still sounds useful, so I
>> did some testing and am resubmitting it. The first version was reviewed
>> by Kirill Tkhai, so I kept his Reviewed-by tag since there is no change
>> to the patch except the commit log.
>>
>> So, would you please reconsider this patch?
>>
>> v3: Use enum to define get_ksm_page operations (nolock, lock and
>> trylock) per John Hubbard
>> v2: Updated the commit log to reflect some test results and the latest
>> discussion
>>
>> mm/ksm.c | 46 ++++++++++++++++++++++++++++++++++++----------
>> 1 file changed, 36 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/ksm.c b/mm/ksm.c
>> index 6c48ad1..5647bc1 100644
>> --- a/mm/ksm.c
>> +++ b/mm/ksm.c
>> @@ -667,6 +667,12 @@ static void remove_node_from_stable_tree(struct stable_node *stable_node)
>> free_stable_node(stable_node);
>> }
>> +enum get_ksm_page_flags {
>> + GET_KSM_PAGE_NOLOCK,
>> + GET_KSM_PAGE_LOCK,
>> + GET_KSM_PAGE_TRYLOCK
>> +};
>> +
>> /*
>> * get_ksm_page: checks if the page indicated by the stable node
>> * is still its ksm page, despite having held no reference to it.
>> @@ -686,7 +692,8 @@ static void remove_node_from_stable_tree(struct stable_node *stable_node)
>> * a page to put something that might look like our key in page->mapping.
>> * is on its way to being freed; but it is an anomaly to bear in mind.
>> */
>> -static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it)
>> +static struct page *get_ksm_page(struct stable_node *stable_node,
>> + enum get_ksm_page_flags flags)
>> {
>> struct page *page;
>> void *expected_mapping;
>> @@ -728,8 +735,15 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it)
>> goto stale;
>> }
>> - if (lock_it) {
>> + if (flags == GET_KSM_PAGE_TRYLOCK) {
>> + if (!trylock_page(page)) {
>> + put_page(page);
>> + return ERR_PTR(-EBUSY);
>> + }
>> + } else if (flags == GET_KSM_PAGE_LOCK)
>> lock_page(page);
>> +
>> + if (flags != GET_KSM_PAGE_NOLOCK) {
>> if (READ_ONCE(page->mapping) != expected_mapping) {
>> unlock_page(page);
>> put_page(page);
>> @@ -763,7 +777,7 @@ static void remove_rmap_item_from_tree(struct rmap_item *rmap_item)
>> struct page *page;
>> stable_node = rmap_item->head;
>> - page = get_ksm_page(stable_node, true);
>> + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK);
>> if (!page)
>> goto out;
>> @@ -863,7 +877,7 @@ static int remove_stable_node(struct stable_node *stable_node)
>> struct page *page;
>> int err;
>> - page = get_ksm_page(stable_node, true);
>> + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK);
>> if (!page) {
>> /*
>> * get_ksm_page did remove_node_from_stable_tree itself.
>> @@ -1385,7 +1399,7 @@ static struct page *stable_node_dup(struct stable_node **_stable_node_dup,
>> * stable_node parameter itself will be freed from
>> * under us if it returns NULL.
>> */
>> - _tree_page = get_ksm_page(dup, false);
>> + _tree_page = get_ksm_page(dup, GET_KSM_PAGE_NOLOCK);
>> if (!_tree_page)
>> continue;
>> nr += 1;
>> @@ -1508,7 +1522,7 @@ static struct page *__stable_node_chain(struct stable_node **_stable_node_dup,
>> if (!is_stable_node_chain(stable_node)) {
>> if (is_page_sharing_candidate(stable_node)) {
>> *_stable_node_dup = stable_node;
>> - return get_ksm_page(stable_node, false);
>> + return get_ksm_page(stable_node, GET_KSM_PAGE_NOLOCK);
>> }
>> /*
>> * _stable_node_dup set to NULL means the stable_node
>> @@ -1613,7 +1627,8 @@ static struct page *stable_tree_search(struct page *page)
>> * wrprotected at all times. Any will work
>> * fine to continue the walk.
>> */
>> - tree_page = get_ksm_page(stable_node_any, false);
>> + tree_page = get_ksm_page(stable_node_any,
>> + GET_KSM_PAGE_NOLOCK);
>> }
>> VM_BUG_ON(!stable_node_dup ^ !!stable_node_any);
>> if (!tree_page) {
>> @@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct page *page)
>> * It would be more elegant to return stable_node
>> * than kpage, but that involves more changes.
>> */
>> - tree_page = get_ksm_page(stable_node_dup, true);
>> + tree_page = get_ksm_page(stable_node_dup,
>> + GET_KSM_PAGE_TRYLOCK);
>> +
>> + if (PTR_ERR(tree_page) == -EBUSY)
>> + return ERR_PTR(-EBUSY);
>
> or just:
>
> if (PTR_ERR(tree_page) == -EBUSY)
> return tree_page;
>
> right?
Either looks fine to me. Returning the errno may look more explicit? Anyway,
I really don't have a preference.
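
[For readers following the thread: the two forms are interchangeable because
the kernel's ERR_PTR() encodes a small negative errno in a pointer value that
can never be a valid address, so once tree_page holds ERR_PTR(-EBUSY),
returning tree_page propagates the same encoded errno. Below is a minimal
userspace model of that encoding; the helpers imitate include/linux/err.h but
are simplified stand-ins, not the kernel's real macros.]

    #include <errno.h>
    #include <stdio.h>

    /*
     * Userspace stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR()
     * helpers from include/linux/err.h: a small negative errno is encoded
     * as a pointer in the top MAX_ERRNO bytes of the address space, a
     * range that is never a valid address.
     */
    #define MAX_ERRNO	4095

    static inline void *ERR_PTR(long error)
    {
    	return (void *)error;
    }

    static inline long PTR_ERR(const void *ptr)
    {
    	return (long)ptr;
    }

    static inline int IS_ERR(const void *ptr)
    {
    	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
    }

    int main(void)
    {
    	/* Model get_ksm_page() failing its trylock. */
    	void *tree_page = ERR_PTR(-EBUSY);

    	/*
    	 * The two styles discussed above are equivalent here:
    	 * "return ERR_PTR(-EBUSY)" and "return tree_page" hand the
    	 * caller the same encoded value.
    	 */
    	if (IS_ERR(tree_page) && PTR_ERR(tree_page) == -EBUSY)
    		printf("busy: skip and rescan later (errno %ld)\n",
    		       -PTR_ERR(tree_page));
    	return 0;
    }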
>
>> +
>> if (unlikely(!tree_page))
>> /*
>> * The tree may have been rebalanced,
>> @@ -1842,7 +1862,8 @@ static struct stable_node *stable_tree_insert(struct page *kpage)
>> * wrprotected at all times. Any will work
>> * fine to continue the walk.
>> */
>> - tree_page = get_ksm_page(stable_node_any, false);
>> + tree_page = get_ksm_page(stable_node_any,
>> + GET_KSM_PAGE_NOLOCK);
>> }
>> VM_BUG_ON(!stable_node_dup ^ !!stable_node_any);
>> if (!tree_page) {
>> @@ -2060,6 +2081,10 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
>> /* We first start with searching the page inside the stable tree */
>> kpage = stable_tree_search(page);
>> +
>> + if (PTR_ERR(kpage) == -EBUSY)
>> + return;
>> +
>> if (kpage == page && rmap_item->head == stable_node) {
>> put_page(kpage);
>> return;
>> @@ -2242,7 +2267,8 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
>> list_for_each_entry_safe(stable_node, next,
>> &migrate_nodes, list) {
>> - page = get_ksm_page(stable_node, false);
>> + page = get_ksm_page(stable_node,
>> + GET_KSM_PAGE_NOLOCK);
>> if (page)
>> put_page(page);
>> cond_resched();
>>
>
> Hi Yang,
>
> The patch looks correct as far as doing what it claims to do. I'll leave it
> to others to decide if a trylock-based approach is really what you want
> for KSM scans. It seems reasonable from my very limited knowledge of
> KSM: there shouldn't be any cases where you really *need* to wait for
> a page lock, because the whole system is really sort of an optimization
> anyway.
Thanks!
Yang
>
>
> thanks,
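
[A postscript for readers outside MM: the control flow this patch adds to
get_ksm_page() can be sketched in plain userspace C. This is an illustrative
sketch only; struct fake_page and the pthread mutex are stand-ins for struct
page and the page lock, while the enum and flag names mirror the patch.
Compile with -lpthread.]

    #include <errno.h>
    #include <pthread.h>
    #include <stdio.h>

    /* Mirrors the enum this patch adds to mm/ksm.c. */
    enum get_ksm_page_flags {
    	GET_KSM_PAGE_NOLOCK,
    	GET_KSM_PAGE_LOCK,
    	GET_KSM_PAGE_TRYLOCK
    };

    /* Stand-in for struct page; the mutex models the page lock. */
    struct fake_page {
    	pthread_mutex_t lock;
    };

    /*
     * Models get_ksm_page()'s locking behavior: LOCK may sleep,
     * TRYLOCK bails out with -EBUSY instead of blocking, and NOLOCK
     * touches no lock at all.
     */
    static int get_page(struct fake_page *page, enum get_ksm_page_flags flags)
    {
    	if (flags == GET_KSM_PAGE_TRYLOCK) {
    		if (pthread_mutex_trylock(&page->lock))
    			return -EBUSY;	/* busy: caller skips this page */
    	} else if (flags == GET_KSM_PAGE_LOCK) {
    		pthread_mutex_lock(&page->lock);	/* may block */
    	}
    	return 0;
    }

    int main(void)
    {
    	struct fake_page page = { .lock = PTHREAD_MUTEX_INITIALIZER };

    	pthread_mutex_lock(&page.lock);	/* simulate a concurrent holder */
    	if (get_page(&page, GET_KSM_PAGE_TRYLOCK) == -EBUSY)
    		printf("page busy: skip, merge on the next scan\n");
    	pthread_mutex_unlock(&page.lock);
    	return 0;
    }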