Message-Id: <201712072000.FCE30281.FOFHOOtVMQLJFS@I-love.SAKURA.ne.jp>
Date: Thu, 7 Dec 2017 20:00:58 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: mhocko@...nel.org
Cc: rientjes@...gle.com, akpm@...ux-foundation.org,
aarcange@...hat.com, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: Multiple oom_reaper BUGs: unmap_page_range racing with exit_mmap
Michal Hocko wrote:
> Hmm, so you are creating a separate process (from the signal point of
> view) and I suspect it is one of those that holds the last reference to
> the mm_struct which is released here and it has tsk_oom_victim = F
Right.
> So we need a more robust test for the oom victim. Your suggestion is
> basically what I came up with originally [1] and which was deemed
> ineffective because we took the mmap_sem even for regular paths and
> Kirill was afraid this adds some unnecessary cycles to the exit path
> which is quite hot.
>
> So I guess we have to do something else instead. We have to store the
> oom flag to the mm struct as well. Something like the patch below.
Yes, adding a new flag for this purpose will work.
Also, setting the MMF_UNSTABLE flag after sending SIGKILL but before
victim->mm becomes NULL, and testing MMF_UNSTABLE at exit_mm(), should work.
But I prefer the simple revert + mmget()/mmput_async() approach at
http://lkml.kernel.org/r/201712062037.DAF90168.SVFQOJFMOOtHLF@I-love.SAKURA.ne.jp ,
because my approach not only saves lines but also fixes the unexpected change for nommu at
http://lkml.kernel.org/r/201711091949.BDB73475.OSHFOMQtLFOFVJ@I-love.SAKURA.ne.jp .
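The flag idea discussed above (marking the mm itself rather than relying on a per-task victim test) can be illustrated with a small user-space model. This is only a sketch, not kernel code: every `model_` name and `MODEL_MMF_UNSTABLE` are hypothetical stand-ins, and the real locking and memory-ordering requirements are ignored. It shows only that a task sharing the mm can observe the mark even though it was never selected as the victim.

```c
/* User-space model only; NOT kernel code. All model_* names are hypothetical. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stands in for an mm-level flag such as MMF_UNSTABLE (value is an assumption). */
#define MODEL_MMF_UNSTABLE (1UL << 0)

struct model_mm {
    atomic_ulong flags;         /* per-mm flags, shared by every task using this mm */
};

struct model_task {
    struct model_mm *mm;        /* NULL once the task has dropped its mm */
    bool sigkill_pending;
};

/*
 * OOM killer side: send SIGKILL first, then mark the mm while victim->mm
 * is still non-NULL. Returns false if the mm was already gone.
 */
static bool model_oom_kill(struct model_task *victim)
{
    victim->sigkill_pending = true;
    if (victim->mm == NULL)
        return false;
    atomic_fetch_or(&victim->mm->flags, MODEL_MMF_UNSTABLE);
    return true;
}

/*
 * Exit side: the test looks at the mm, so it also fires for a task that
 * shares the mm but was never itself marked as an OOM victim.
 */
static bool model_exit_sees_oom(struct model_mm *mm)
{
    return (atomic_load(&mm->flags) & MODEL_MMF_UNSTABLE) != 0;
}
```

In this model, a second `model_task` pointing at the same `model_mm` gets `true` from `model_exit_sees_oom()` as soon as `model_oom_kill()` has run on the first task. That is the property the per-task check lacks.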
Also, if we replace asynchronous OOM reaping by the OOM reaper kernel thread with
synchronous OOM reaping by the OOM killer, we can close the MMF_OOM_SKIP race window,
because __oom_reap_task_mm() is then guaranteed to be called before __mmput() is
called.
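The ordering argument above can be sketched with a single-threaded user-space model. This is an illustration under assumptions, not the proposed patch: it assumes the killer pins the mm with a reference before reaping and drops it afterwards, and every `sketch_` name is hypothetical. The `users` counter stands in for mm_users, and teardown stands in for __mmput().

```c
/* User-space model only; NOT kernel code. All sketch_* names are hypothetical. */
#include <stdbool.h>

struct sketch_mm {
    int users;          /* stands in for mm_users */
    int step;           /* event counter used to record ordering */
    int reaped_at;      /* step at which the reap ran, 0 if never */
    int torn_down_at;   /* step at which teardown ran, 0 if never */
};

/* Take a reference, like mmget(). */
static void sketch_mmget(struct sketch_mm *mm)
{
    mm->users++;
}

/* Final teardown, standing in for __mmput(). */
static void sketch_teardown(struct sketch_mm *mm)
{
    mm->torn_down_at = ++mm->step;
}

/* Drop a reference; the last put triggers teardown. */
static void sketch_mmput(struct sketch_mm *mm)
{
    if (--mm->users == 0)
        sketch_teardown(mm);
}

/* Stands in for __oom_reap_task_mm(). */
static void sketch_reap(struct sketch_mm *mm)
{
    mm->reaped_at = ++mm->step;
}

/*
 * Synchronous reaping by the killer: pin, reap, unpin. While the pin is
 * held, users cannot reach zero, so teardown cannot precede the reap.
 * (The kernel-side drop would be mmput_async(); the model drops directly.)
 */
static void sketch_oom_kill_sync(struct sketch_mm *mm)
{
    sketch_mmget(mm);
    sketch_reap(mm);
    sketch_mmput(mm);
}
```

Whichever side drops the last reference, the victim or the killer, `reaped_at` is recorded before `torn_down_at` in this model, because the reap always runs while the killer's reference is held.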