Message-ID: <20170623123830.GW5308@dhcp22.suse.cz>
Date: Fri, 23 Jun 2017 14:38:30 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc: rientjes@...gle.com, akpm@...ux-foundation.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: Re: [patch] mm, oom: prevent additional oom kills before memory
is freed
On Sat 17-06-17 22:30:31, Tetsuo Handa wrote:
> Michal Hocko wrote:
[...]
> > What does this disassemble to on your kernel? Care to post addr2line?
>
[...]
> The __oom_reap_task_mm+0xa1/0x160 is __oom_reap_task_mm at mm/oom_kill.c:472
> which is "struct vm_area_struct *vma;" line in __oom_reap_task_mm().
> The __oom_reap_task_mm+0xb1/0x160 is __oom_reap_task_mm at mm/oom_kill.c:519
> which is "if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED))" line.
> The <49> 8b 46 50 is "vma->vm_flags" in can_madv_dontneed_vma(vma) from __oom_reap_task_mm().
OK, I see what is going on here. I should have noticed this earlier,
sorry, my fault. We are simply accessing a stale mm->mmap. exit_mmap()
calls remove_vma(), which frees all the vmas, but it doesn't reset
mm->mmap to NULL. Trivial to fix.
diff --git a/mm/mmap.c b/mm/mmap.c
index ca58f8a2a217..253808e716dc 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2979,6 +2979,7 @@ void exit_mmap(struct mm_struct *mm)
 			nr_accounted += vma_pages(vma);
 		vma = remove_vma(vma);
 	}
+	mm->mmap = NULL;
 	vm_unacct_memory(nr_accounted);
 	up_write(&mm->mmap_sem);
 }
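For readers following along, here is a minimal userspace sketch of the
stale list head problem the patch addresses. It is not kernel code:
toy_vma, toy_mm and all the toy_* helpers are hypothetical stand-ins
invented for illustration. They only mimic the shape of mm->mmap and
remove_vma().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for vm_area_struct and mm_struct. */
struct toy_vma {
	struct toy_vma *next;
	unsigned long flags;
};

struct toy_mm {
	struct toy_vma *mmap;	/* head of the vma list, like mm->mmap */
};

/* Prepend a node; returns 0 on success, -1 if allocation fails. */
static int toy_add_vma(struct toy_mm *mm, unsigned long flags)
{
	struct toy_vma *vma = calloc(1, sizeof(*vma));

	if (!vma)
		return -1;
	vma->flags = flags;
	vma->next = mm->mmap;
	mm->mmap = vma;
	return 0;
}

/* Free one node and return its successor, in the style of remove_vma(). */
static struct toy_vma *toy_remove_vma(struct toy_vma *vma)
{
	struct toy_vma *next;

	assert(vma != NULL);
	next = vma->next;
	free(vma);
	return next;
}

/* Free every node, then clear the head so no dangling pointer survives. */
static void toy_exit_mmap(struct toy_mm *mm)
{
	struct toy_vma *vma = mm->mmap;

	while (vma)
		vma = toy_remove_vma(vma);
	mm->mmap = NULL;	/* without this line the head still points at freed memory */
}

/* Reaper-style walk: count the nodes reachable from the head. */
static size_t toy_count_vmas(const struct toy_mm *mm)
{
	size_t n = 0;

	for (const struct toy_vma *vma = mm->mmap; vma; vma = vma->next)
		n++;
	return n;
}
```

With the head cleared, a later walk sees an empty list and does nothing.
Without that assignment the walk would dereference freed nodes, which
is the kind of access the trace above points at.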
> Is it safe for the OOM reaper to call tlb_gather_mmu()/unmap_page_range()/tlb_finish_mmu() sequence
> after the OOM victim already completed tlb_gather_mmu()/unmap_vmas()/free_pgtables()/tlb_finish_mmu()/
> remove_vma() sequence from exit_mmap() from __mmput() from mmput() from exit_mm() from do_exit() ?
It is safe to race up to unmap_vmas() because that only needs mmap_sem
in read mode (like madvise(MADV_DONTNEED)). All the later operations
have to be serialized because we cannot tear down page tables while the
oom reaper is doing a pte walk. Once exit_mmap() drops the mmap_sem it
held for write, there are no vmas left, so there is nothing for the
reaper to do.
I will give the patch more testing next week. This week was busy as hell
(I was travelling and then the stack gap thingy...).
--
Michal Hocko
SUSE Labs