Message-ID: <20171206085027.GD16386@dhcp22.suse.cz>
Date:   Wed, 6 Dec 2017 09:50:27 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     David Rientjes <rientjes@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: Multiple oom_reaper BUGs: unmap_page_range racing with exit_mmap
On Wed 06-12-17 12:28:53, Tetsuo Handa wrote:
> David Rientjes wrote:
> > On Tue, 5 Dec 2017, David Rientjes wrote:
> > 
> > > One way to solve the issue is to have two mm flags: one to indicate the mm 
> > > is entering unmap_vmas(): set the flag, do down_write(&mm->mmap_sem); 
> > > up_write(&mm->mmap_sem), then unmap_vmas().  The oom reaper needs this 
> > > flag clear, not MMF_OOM_SKIP, while holding down_read(&mm->mmap_sem) to be 
> > > allowed to call unmap_page_range().  The oom killer will still defer 
> > > selecting this victim for MMF_OOM_SKIP after unmap_vmas() returns.
> > > 
> > > The result of that change would be that we do not oom reap from any mm 
> > > entering unmap_vmas(): we let unmap_vmas() do the work itself and avoid 
> > > racing with it.
> > > 
> > 
> > I think we need something like the following?
> 
> This patch does not work. __oom_reap_task_mm() can find MMF_REAPING and
> return true and sets MMF_OOM_SKIP before exit_mmap() calls down_write().
> 
> Also, I don't know what exit_mmap() is doing but I think that there is a
> possibility that the OOM reaper tries to reclaim mlocked pages as soon as
> exit_mmap() cleared VM_LOCKED flag by calling munlock_vma_pages_all().
> 
> 	if (mm->locked_vm) {
> 		vma = mm->mmap;
> 		while (vma) {
> 			if (vma->vm_flags & VM_LOCKED)
> 				munlock_vma_pages_all(vma);
> 			vma = vma->vm_next;
> 		}
> 	}
I do not really see why this would matter. munlock_vma_pages_all() is
mostly about accounting and clearing the per-page state. It relies on
follow_page(), which walks the page tables, and unmap_page_range() clears
ptes under the same lock that is taken when resolving a locked page.
I still have to think about all the consequences when we are effectively
reaping VM_LOCKED vmas - I suspect we can do some misaccounting but I
do not yet see how this could lead to crashes. Maybe we can move
VM_LOCKED clearing _after_ the munlock business is done but this is
really hard to tell before I re-read the mlock code more thoroughly.
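
To make the ordering idea concrete, here is a toy userspace model. It is
not kernel code and not a proposed patch: struct toy_vma, TOY_VM_LOCKED
and all toy_* functions are hypothetical names made up for illustration,
and it assumes the reaper skips any vma that still has the locked flag
set, which is what Tetsuo's concern implies. It only shows the ordering
"finish the per-page munlock work first, clear the flag last", so that a
reaper checking the flag never sees a half-munlocked vma as reapable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_VM_LOCKED 0x1u	/* hypothetical stand-in for VM_LOCKED */

/* Hypothetical, heavily simplified vma: just a flag word and counters. */
struct toy_vma {
	unsigned int flags;
	size_t nr_pages;	/* pages covered by this vma */
	size_t nr_munlocked;	/* pages whose per-page state was cleared */
	struct toy_vma *next;
};

/* Assumed reaper-side check: leave vmas alone while the flag is set. */
static bool toy_reaper_may_unmap(const struct toy_vma *vma)
{
	return !(vma->flags & TOY_VM_LOCKED);
}

/* Do all per-page work first; drop the flag only once that is done. */
static void toy_munlock_vma_flag_last(struct toy_vma *vma)
{
	assert(vma->flags & TOY_VM_LOCKED);

	for (size_t i = 0; i < vma->nr_pages; i++)
		vma->nr_munlocked++;	/* stands in for per-page munlock */

	vma->flags &= ~TOY_VM_LOCKED;
}

/* Same list walk as the quoted exit_mmap() snippet, on the toy types. */
static void toy_munlock_all(struct toy_vma *head)
{
	for (struct toy_vma *vma = head; vma; vma = vma->next)
		if (vma->flags & TOY_VM_LOCKED)
			toy_munlock_vma_flag_last(vma);
}
```

In this model a locked vma only becomes visible to the reaper check after
nr_munlocked has reached nr_pages. Whether the real mlock code allows the
flag to be cleared that late is exactly the open question above.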
-- 
Michal Hocko
SUSE Labs