Message-ID: <20170609140853.GA14760@cmpxchg.org>
Date: Fri, 9 Jun 2017 10:08:53 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Michal Hocko <mhocko@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Roman Gushchin <guro@...com>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
Vladimir Davydov <vdavydov.dev@...il.com>, linux-mm@...ck.org,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 2/2] mm, oom: do not trigger out_of_memory from the #PF

On Thu, Jun 08, 2017 at 04:36:07PM +0200, Michal Hocko wrote:
> Does anybody see any problem with the patch, or can I send it for
> inclusion?
>
> On Fri 19-05-17 13:26:04, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@...e.com>
> >
> > Any allocation failure during the #PF path will return with VM_FAULT_OOM
> > which in turn results in pagefault_out_of_memory. This can happen for
> > 2 different reasons. a) Memcg is out of memory and we rely on
> > mem_cgroup_oom_synchronize to perform the memcg OOM handling or b)
> > normal allocation fails.
> >
> > The latter is quite problematic because allocation paths already trigger
> > out_of_memory and the page allocator tries really hard to not fail
> > allocations. Anyway, if the OOM killer has already been invoked there
> > is no reason to invoke it again from the #PF path. Especially when the
> > OOM condition might be gone by that time and we have no way to find out
> > other than to allocate.
> >
> > Moreover, if the allocation failed and the OOM killer hasn't been
> > invoked then we are unlikely to do the right thing from the #PF context
> > because we have already lost the allocation context and restrictions and
> > therefore might oom kill a task from a different NUMA domain.
> >
> > An allocation might fail also when the current task is the oom victim
> > and there are no memory reserves left and we should simply bail out
> > from the #PF rather than invoking out_of_memory.
> >
> > This all suggests that there is no legitimate reason to trigger
> > out_of_memory from pagefault_out_of_memory so drop it. Just to be sure
> > that no #PF path returns with VM_FAULT_OOM without allocation print a
> > warning that this is happening before we restart the #PF.
> >
> > Signed-off-by: Michal Hocko <mhocko@...e.com>

I don't agree with this patch.

The warning you replace the oom call with indicates that we never
expect a VM_FAULT_OOM to leak to this point. But should there be a
leak, it's infinitely better to tickle the OOM killer again - even if
that call is then fairly inaccurate and without alloc context - than
infinite re-invocations of the #PF when the VM_FAULT_OOM comes from a
context - existing or future - that isn't allowed to trigger the OOM.

I'm not a fan of defensive programming, but is this call to OOM more
expensive than the printk() somehow? And how certain are you that no
VM_FAULT_OOMs will leak, given how spread out page fault handlers are and
how complex the different allocation contexts inside them are?
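
To make the retry concern above concrete, here is a minimal user-space
sketch in C. It is not kernel code: every name in it (sim_state,
leaky_fault_handler, model_oom_kill, simulate_fault) is hypothetical, and
it assumes a fault handler that returns an OOM result without having
invoked the OOM killer itself, plus a modelled kill that frees a fixed
number of pages. It only compares "warn and restart the fault" with
"invoke the OOM killer, then restart the fault".

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum fault_result { FAULT_OK, FAULT_OOM };

enum oom_policy {
    POLICY_WARN_ONLY,  /* modelled patch: warn, then restart the fault */
    POLICY_INVOKE_OOM, /* modelled current behaviour: call the OOM killer */
};

struct sim_state {
    int free_pages; /* pages the faulting task can still get */
    int oom_kills;  /* how often the modelled OOM killer ran */
    int warnings;   /* how often the modelled warning fired */
};

/*
 * Assumption: a handler that "leaks" FAULT_OOM, i.e. it fails without
 * having triggered any OOM handling on its own.
 */
static enum fault_result leaky_fault_handler(struct sim_state *s)
{
    if (s->free_pages > 0) {
        s->free_pages--;
        return FAULT_OK;
    }
    return FAULT_OOM;
}

/* Assumption: killing a victim frees a fixed number of pages. */
static void model_oom_kill(struct sim_state *s)
{
    s->oom_kills++;
    s->free_pages += 8;
}

/*
 * Retry the fault until it succeeds. Returns the number of attempts
 * needed, or -1 if max_attempts was exhausted (standing in for an
 * endless fault loop).
 */
static int simulate_fault(struct sim_state *s, enum oom_policy policy,
                          int max_attempts)
{
    assert(max_attempts > 0);

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (leaky_fault_handler(s) == FAULT_OK)
            return attempt;

        if (policy == POLICY_INVOKE_OOM)
            model_oom_kill(s);
        else
            s->warnings++;
    }
    return -1;
}
```

Under these assumptions, and starting from zero free pages,
simulate_fault() with POLICY_WARN_ONLY exhausts whatever attempt budget
it is given, while POLICY_INVOKE_OOM succeeds on the second attempt after
one modelled kill. The model says nothing about whether such a leak can
actually occur in the kernel, which is the open question in this thread.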