Message-Id: <201601290726.GGC12497.OSQJVtMFFOHOLF@I-love.SAKURA.ne.jp>
Date: Fri, 29 Jan 2016 07:26:39 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: mhocko@...nel.org
Cc: akpm@...ux-foundation.org, hannes@...xchg.org, mgorman@...e.de,
rientjes@...gle.com, torvalds@...ux-foundation.org,
oleg@...hat.com, hughd@...gle.com, andrea@...nel.org,
riel@...hat.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/2] oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address space
Michal Hocko wrote:
> On Thu 28-01-16 20:24:36, Tetsuo Handa wrote:
> [...]
> > I like the OOM reaper approach but I can't agree on merging the OOM reaper
> > without providing a guaranteed last resort at the same time. If you do want
> > to start the OOM reaper as simple as possible (without being bothered by
> > a lot of possible corner cases), please pursue a guaranteed last resort
> > at the same time.
>
> I am getting tired of this level of argumentation. oom_reaper in its
> current form is a step forward. I have acknowledged there are possible
> improvements doable on top but I do not see them necessary for the core
> part being merged. I am not trying to rush this in because I am very
> well aware of how subtle and complex all the interactions might be.
> So please stop your "we must have it all at once" attitude. This is
> nothing we have to rush in. We are not talking about a regression which
> has to be absolutely fixed in few days.
I'm not asking you to merge a perfect version of oom_reaper from the
beginning. I know that is too difficult. Instead, I'm asking you to allow
timeout-based approaches (shown below) as a temporary workaround,
because there are environments which cannot wait for oom_reaper to become
reliable enough. Would you please reply to the thread which proposed a
guaranteed last resort (shown below)?
Tetsuo Handa wrote:
> I consider phases for managing system-wide OOM events as follows.
>
> (1) Design and use a system with appropriate memory capacity in mind.
>
> (2) When (1) failed, the OOM killer is invoked. The OOM killer selects
> an OOM victim and allows that victim access to memory reserves by
> setting TIF_MEMDIE on it.
>
> (3) When (2) did not solve the OOM condition, start allowing all tasks
> access to memory reserves by your approach.
>
> (4) When (3) did not solve the OOM condition, start selecting more OOM
> victims by my approach.
>
> (5) When (4) did not solve the OOM condition, trigger the kernel panic.
>
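The escalation through phases (2) to (5) can be pictured as a small state
machine. The following is only a rough userspace sketch, not kernel code:
the enum names, the oom_next_phase() helper and the single
OOM_PHASE_TIMEOUT_MS value are all hypothetical, and driving every step by
one fixed timeout is an assumption made for illustration.

```c
#include <assert.h>

/* Hypothetical phase names mirroring steps (2)-(5) of the list above. */
enum oom_phase {
	OOM_KILL_VICTIM   = 2,	/* (2) select a victim and set TIF_MEMDIE */
	OOM_OPEN_RESERVES = 3,	/* (3) let all tasks access memory reserves */
	OOM_MORE_VICTIMS  = 4,	/* (4) select more OOM victims */
	OOM_PANIC         = 5	/* (5) last resort: kernel panic */
};

/* Hypothetical per-phase timeout; this is not an existing kernel tunable. */
#define OOM_PHASE_TIMEOUT_MS 5000u

/*
 * Return the phase to use after waiting waited_ms in phase cur while the
 * OOM condition is still unresolved. The caller is expected to stop calling
 * this helper as soon as the OOM condition is solved.
 */
static enum oom_phase oom_next_phase(enum oom_phase cur, unsigned int waited_ms)
{
	assert(cur >= OOM_KILL_VICTIM && cur <= OOM_PANIC);

	if (cur == OOM_PANIC)
		return cur;		/* nothing left to escalate to */
	if (waited_ms < OOM_PHASE_TIMEOUT_MS)
		return cur;		/* give the current phase more time */
	return (enum oom_phase)(cur + 1);	/* timed out: try the next phase */
}
```

With this shape, a stalled phase always moves on to the next one after a
bounded wait, so the sequence cannot stay stuck before reaching the final
step.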