Message-ID: <alpine.DEB.2.02.1408041710070.23228@chino.kir.corp.google.com>
Date: Mon, 4 Aug 2014 17:18:42 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Johannes Weiner <hannes@...xchg.org>
cc: Andrew Morton <akpm@...ux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Rik van Riel <riel@...hat.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

On Sat, 2 Aug 2014, Johannes Weiner wrote:
> > I see one concern: that panic_on_oom == 1 will not trigger on pagefault
> > when constrained by cpusets. To address that, I'll state that, since
> > cpuset-constrained allocations are the allocation context for pagefaults,
> > panic_on_oom == 1 should not trigger on pagefault when constrained by
> > cpusets.
>
> I expressed my concern pretty clearly above: out_of_memory() wants the
> zonelist that was used during the failed allocation, you are passing a
> non-sensical value in there that only happens to have the same type.
>
It's certainly meaningful: the particular zonelist chosen isn't important
because we don't care about the ordering and pagefaults are not going to
be using __GFP_THISNODE. In this context, we only need to pass a zonelist
that includes all zones, because constrained_alloc() then tests whether
the allocation is cpuset-constrained based on the gfp flags. We'll get
CONSTRAINT_CPUSET in that case.
This is important because the behavior of panic_on_oom differs, as you
pointed out, depending on the constraint. pagefault_out_of_memory(), with
my patch, will always get CONSTRAINT_CPUSET when needed and
check_panic_on_oom() will behave correctly now for cpusets.
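
To make the dependency on the constraint concrete, here is a minimal
user-space sketch of the decision described above. It is not the kernel
code: struct alloc_ctx and the model_* functions are hypothetical
stand-ins, the three booleans collapse what the kernel derives from the
gfp flags, nodemask, and zonelist, and the ordering of the checks is an
assumption modelled on constrained_alloc().

```c
#include <assert.h>
#include <stdbool.h>

/* Names mirror mm/oom_kill.c, but this is only an illustrative model. */
enum oom_constraint {
	CONSTRAINT_NONE,
	CONSTRAINT_CPUSET,
	CONSTRAINT_MEMORY_POLICY,
};

/* Hypothetical summary of the failed allocation's context. */
struct alloc_ctx {
	bool thisnode;             /* stands in for gfp_mask & __GFP_THISNODE */
	bool nodemask_restricts;   /* a mempolicy nodemask excludes some nodes */
	bool cpuset_excludes_zone; /* some zone in the zonelist is outside the cpuset */
};

/* Classify the allocation; assumed to follow constrained_alloc()'s order. */
enum oom_constraint model_constrained_alloc(const struct alloc_ctx *ctx)
{
	if (ctx->thisnode)
		return CONSTRAINT_NONE;
	if (ctx->nodemask_restricts)
		return CONSTRAINT_MEMORY_POLICY;
	if (ctx->cpuset_excludes_zone)
		return CONSTRAINT_CPUSET;
	return CONSTRAINT_NONE;
}

/*
 * panic_on_oom == 0: never panic.
 * panic_on_oom == 1: panic only for unconstrained (system-wide) OOM.
 * panic_on_oom == 2: panic regardless of the constraint.
 */
bool model_should_panic(int panic_on_oom, enum oom_constraint constraint)
{
	assert(panic_on_oom >= 0 && panic_on_oom <= 2);

	if (panic_on_oom == 0)
		return false;
	if (panic_on_oom != 2 && constraint != CONSTRAINT_NONE)
		return false;
	return true;
}
```

In this model a pagefault has no nodemask and no __GFP_THISNODE, so with
a zonelist covering all zones the result is CONSTRAINT_CPUSET whenever
the cpuset excludes a zone, and panic_on_oom == 1 then does not panic.
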
> We simply don't have the right information at the end of the page
> fault handler to respect constrained allocations. Case in point:
> nodemask is unset from pagefault_out_of_memory(), so we still kill
> based on mempolicy even though check_panic_on_oom() says it wouldn't.
>
That is, in fact, the last bit of information we need in the pagefault
handler to make correct decisions. It's important, too, since if the vma
of the faulting address is constrained by a mempolicy, we want to avoid
needlessly killing a process that has a mempolicy with a disjoint set of
nodes.
> The code change is not an adequate solution for the problem we have
> here and the changelog is an insult to everybody who wants to make
> sense of this from the git history later on.
>
We can also address mempolicies by modifying the page fault handler and
passing the vma and faulting address, not only to make the correct
panic_on_oom decisions but also to filter out processes whose mempolicies
consist solely of a disjoint set of nodes. I'll post that patch series as
well.
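
The filtering idea can be sketched in user space as a node-set
intersection test. This is only an illustration under stated assumptions,
not the proposed patch series: model_nodemask, struct model_task, and the
model_* functions are hypothetical, a nodemask is reduced to a bitmask,
and a task without a mempolicy is assumed to remain eligible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical nodemask: bit n set means NUMA node n is allowed. */
typedef unsigned long model_nodemask;

/* Hypothetical, heavily reduced view of a candidate task. */
struct model_task {
	const char *comm;            /* task name, for illustration only */
	bool has_mempolicy;          /* task is bound by a mempolicy */
	model_nodemask policy_nodes; /* nodes that mempolicy allows */
};

/*
 * A task is worth considering only if killing it could free memory on a
 * node the faulting vma's mempolicy may allocate from.
 */
bool model_oom_eligible(const struct model_task *t, model_nodemask fault_nodes)
{
	if (!t->has_mempolicy)
		return true; /* assumed: unrestricted tasks may use any node */
	return (t->policy_nodes & fault_nodes) != 0;
}

/* Count the candidates left after dropping tasks with disjoint node sets. */
size_t model_count_eligible(const struct model_task *tasks, size_t n,
			    model_nodemask fault_nodes)
{
	size_t i, count = 0;

	assert(tasks != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (model_oom_eligible(&tasks[i], fault_nodes))
			count++;
	return count;
}
```

With a fault constrained to nodes 0-1, a task bound to nodes 2-3 is
skipped, while a task bound to node 1 or a task with no mempolicy stays
a candidate.
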