Message-ID: <55879AD2.30507@codeaurora.org>
Date: Mon, 22 Jun 2015 10:49:14 +0530
From: Susheel Khiani <skhiani@...eaurora.org>
To: Hugh Dickins <hughd@...gle.com>
CC: akpm@...ux-foundation.org, peterz@...radead.org, neilb@...e.de,
dhowells@...hat.com, paulmcquad@...il.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [Question] ksm: rmap_item pointing to some stale vmas
On 6/9/2015 11:56 PM, Susheel Khiani wrote:
> On 4/30/2015 11:37 AM, Susheel Khiani wrote:
>>> But if I've misunderstood, and you think that what you're seeing
>>> fits with the transient forking bugs I've (not quite) described,
>>> and you can explain why even the transient case is important for
>>> you to have fixed, then I really ought to redouble my efforts.
>>>
>>> Hugh
>
> I was able to root-cause the issue: we hit several instances of it, and
> it reproduced frequently under stress tests. It matters to us because
> failure to unmap a KSM page was resulting in CMA allocation failures.
>
> For cases like fork, what we observed is that for privately mapped file
> pages, the stable_node pointed to by the KSM page won't cover all the
> mappings until ksmd completes one full scan. Only after that scan do new
> rmap_items pointing to the mappings in the child process come into
> existence. So in cases like CMA allocation, where we can't wait for ksmd
> to complete one full cycle, we can traverse the anon_vma tree from the
> parent's anon_vma to find all the vmas where the CMA page is mapped.
>
> I have tested the following patch on a 3.10 kernel, and with this change
> I am able to avoid the CMA allocation failures that we were otherwise
> seeing frequently because the KSM page could not be unmapped.
>
> Please review and let me know your feedback.
>
>
>
> [PATCH] ksm: Traverse through parent's anon_vma while unmapping
>
> In try_to_unmap_ksm, we traverse the rmap_item list to
> find all the anon_vmas from which the page needs to be
> unmapped.
>
> By design, KSM builds up its data structures by looking
> into each mm, and comes back a cycle later to find out
> which of them are now outdated and need to be updated.
> So, for cases like fork, what we observe is that for
> privately mapped file pages, the stable_node pointed to
> by the KSM page won't cover all the mappings until ksmd
> completes one full scan. Only after that scan do new
> rmap_items pointing to the mappings in the child process
> come into existence.
>
> As a result, a stable page can't be fully unmapped until
> ksmd has completed one full scan. This becomes an issue
> for CMA, where we need to unmap and move a CMA page and
> can't wait for ksmd to complete one cycle. Because the
> rmap_items for the new mappings have not been created yet,
> we can't unmap the CMA page from all the vmas where it is
> mapped. This results in frequent CMA allocation failures.
>
> So instead of relying only on the rmap_item list, which we
> know can be incomplete, we also scan the anon_vma tree from
> the parent's anon_vma to find all the vmas where the CMA
> page is mapped, and thereby successfully unmap the page and
> move it to a new page.
>
> Change-Id: I97cacf6a73734b10c7098362c20fb3f2d4040c76
> Signed-off-by: Susheel Khiani <skhiani@...eaurora.org>
> ---
> mm/ksm.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 55 insertions(+), 3 deletions(-)
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 11f6293..10d5266 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -1956,6 +1956,7 @@ int page_referenced_ksm(struct page *page, struct mem_cgroup *memcg,
> unsigned int mapcount = page_mapcount(page);
> int referenced = 0;
> int search_new_forks = 0;
> + int search_from_root = 0;
>
> VM_BUG_ON(!PageKsm(page));
> VM_BUG_ON(!PageLocked(page));
> @@ -1968,9 +1969,20 @@ again:
> struct anon_vma *anon_vma = rmap_item->anon_vma;
> struct anon_vma_chain *vmac;
> struct vm_area_struct *vma;
> + struct rb_root rb_root;
> +
> + if (!search_from_root) {
> + if (anon_vma)
> + rb_root = anon_vma->rb_root;
> + }
> + else {
> + if (anon_vma && anon_vma->root) {
> + rb_root = anon_vma->root->rb_root;
> + }
> + }
>
> anon_vma_lock_read(anon_vma);
> - anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
> + anon_vma_interval_tree_foreach(vmac, &rb_root,
> 0, ULONG_MAX) {
> vma = vmac->vma;
> if (rmap_item->address < vma->vm_start ||
> @@ -1999,6 +2011,11 @@ again:
> }
> if (!search_new_forks++)
> goto again;
> +
> + if (!search_from_root++) {
> + search_new_forks = 0;
> + goto again;
> + }
> out:
> return referenced;
> }
> @@ -2010,6 +2027,7 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
> struct rmap_item *rmap_item;
> int ret = SWAP_AGAIN;
> int search_new_forks = 0;
> + int search_from_root = 0;
>
> VM_BUG_ON(!PageKsm(page));
> VM_BUG_ON(!PageLocked(page));
> @@ -2028,9 +2046,20 @@ again:
> struct anon_vma *anon_vma = rmap_item->anon_vma;
> struct anon_vma_chain *vmac;
> struct vm_area_struct *vma;
> + struct rb_root rb_root;
> +
> + if (!search_from_root) {
> + if (anon_vma)
> + rb_root = anon_vma->rb_root;
> + }
> + else {
> + if (anon_vma && anon_vma->root) {
> + rb_root = anon_vma->root->rb_root;
> + }
> + }
>
> anon_vma_lock_read(anon_vma);
> - anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
> + anon_vma_interval_tree_foreach(vmac, &rb_root,
> 0, ULONG_MAX) {
> vma = vmac->vma;
> if (rmap_item->address < vma->vm_start ||
> @@ -2056,6 +2085,11 @@ again:
> }
> if (!search_new_forks++)
> goto again;
> +
> + if (!search_from_root++) {
> + search_new_forks = 0;
> + goto again;
> + }
> out:
> return ret;
> }
> @@ -2068,6 +2102,7 @@ int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
> struct rmap_item *rmap_item;
> int ret = SWAP_AGAIN;
> int search_new_forks = 0;
> + int search_from_root = 0;
>
> VM_BUG_ON(!PageKsm(page));
> VM_BUG_ON(!PageLocked(page));
> @@ -2080,9 +2115,21 @@ again:
> struct anon_vma *anon_vma = rmap_item->anon_vma;
> struct anon_vma_chain *vmac;
> struct vm_area_struct *vma;
> + struct rb_root rb_root;
> +
> + if (!search_from_root) {
> + if (anon_vma)
> + rb_root = anon_vma->rb_root;
> + }
> + else {
> + if (anon_vma && anon_vma->root) {
> + rb_root = anon_vma->root->rb_root;
> + }
> + }
> +
>
> anon_vma_lock_read(anon_vma);
> - anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
> + anon_vma_interval_tree_foreach(vmac, &rb_root,
> 0, ULONG_MAX) {
> vma = vmac->vma;
> if (rmap_item->address < vma->vm_start ||
> @@ -2107,6 +2154,11 @@ again:
> }
> if (!search_new_forks++)
> goto again;
> +
> + if (!search_from_root++) {
> + search_new_forks = 0;
> + goto again;
> + }
> out:
> return ret;
> }
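
For reference, the retry logic that the patch adds to all three walkers can
be checked in isolation. The sketch below is a userspace stand-in, not kernel
code: struct pass_log and run_passes() are hypothetical names, and the walk
over the rmap_item list is replaced by recording the two flags. Under those
assumptions it shows that each walker now makes four passes: two against each
rmap_item's own anon_vma tree, then two against the root anon_vma's tree.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PASSES 8

/* Hypothetical log of the flag values seen by each pass (not kernel code). */
struct pass_log {
	int new_forks[MAX_PASSES];
	int from_root[MAX_PASSES];
	int count;
};

/* Mirrors the goto-based retry structure of the patched walkers. */
static void run_passes(struct pass_log *log)
{
	int search_new_forks = 0;
	int search_from_root = 0;

	assert(log != NULL);
	log->count = 0;
again:
	/* Stand-in for the walk over the stable_node's rmap_items. */
	if (log->count < MAX_PASSES) {
		log->new_forks[log->count] = search_new_forks;
		log->from_root[log->count] = search_from_root;
		log->count++;
	}

	/* Existing logic: repeat once, looking at vmas from newer forks. */
	if (!search_new_forks++)
		goto again;

	/* Added logic: repeat both passes against the root's tree. */
	if (!search_from_root++) {
		search_new_forks = 0;
		goto again;
	}
}
```

Calling run_passes() on a struct pass_log leaves count at 4, with the
(search_new_forks, search_from_root) pairs (0,0), (1,0), (0,1), (1,1) in
that order, so the cost of a walk roughly doubles compared with the
unpatched code.
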
Reminder ping: did you get a chance to look at
the previous mail?
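
One detail that may be worth a look during review: the patch copies
struct rb_root by value into a local, and leaves that local unset when
anon_vma is NULL. A minimal sketch of choosing the tree by pointer instead
follows. The types are cut-down userspace mocks that carry only the fields
used here, and ksm_pick_tree() is a hypothetical helper that does not exist
in mm/ksm.c; the NULL checks are defensive and assume the caller skips the
rmap_item when no tree is returned.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-ins for the kernel types; only the fields used below. */
struct rb_node;
struct rb_root {
	struct rb_node *rb_node;
};

struct anon_vma {
	struct anon_vma *root;	/* root of the fork hierarchy */
	struct rb_root rb_root;	/* interval tree of anon_vma_chains */
};

/*
 * Hypothetical helper: return the interval tree to walk for this pass,
 * or NULL if there is nothing to walk.
 */
static struct rb_root *ksm_pick_tree(struct anon_vma *anon_vma,
				     int search_from_root)
{
	if (!anon_vma)
		return NULL;
	if (!search_from_root)
		return &anon_vma->rb_root;
	if (!anon_vma->root)
		return NULL;
	return &anon_vma->root->rb_root;
}

/* Small self-check showing the intended use. */
static void ksm_pick_tree_selftest(void)
{
	struct anon_vma parent = { 0 };
	struct anon_vma child = { 0 };

	parent.root = &parent;	/* a root anon_vma points at itself */
	child.root = &parent;

	assert(ksm_pick_tree(&child, 0) == &child.rb_root);
	assert(ksm_pick_tree(&child, 1) == &parent.rb_root);
	assert(ksm_pick_tree(NULL, 1) == NULL);
}
```

With a helper of this shape, the walkers could pass the returned pointer
straight to anon_vma_interval_tree_foreach() instead of the address of a
local copy.
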
--
Susheel Khiani
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation