Message-ID: <alpine.DEB.2.20.1704111152170.25069@east.gentwo.org>
Date: Tue, 11 Apr 2017 12:24:25 -0500 (CDT)
From: Christoph Lameter <cl@...ux.com>
To: Vlastimil Babka <vbabka@...e.cz>
cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
cgroups@...r.kernel.org, Li Zefan <lizefan@...wei.com>,
Michal Hocko <mhocko@...nel.org>,
Mel Gorman <mgorman@...hsingularity.net>,
David Rientjes <rientjes@...gle.com>,
Hugh Dickins <hughd@...gle.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [RFC 1/6] mm, page_alloc: fix more premature OOM due to race
with cpuset update

On Tue, 11 Apr 2017, Vlastimil Babka wrote:

> The root of the problem is that the cpuset's mems_allowed and mempolicy's
> nodemask can temporarily have no intersection, thus get_page_from_freelist()
> cannot find any usable zone. The current semantic for empty intersection is to
> ignore mempolicy's nodemask and honour cpuset restrictions. This is checked in
> node_zonelist(), but the racy update can happen after we already passed the

The fallback was only intended for cpusets whose boundaries are not
enforced under critical conditions (softwall). A hardwall cpuset
(CS_MEM_HARDWALL) should fail the allocation.
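
For reference, that distinction is what __cpuset_node_allowed() in
kernel/cpuset.c implements; roughly (condensed from memory, with the
locking and OOM-victim special cases left out) it does:

bool __cpuset_node_allowed(int node, gfp_t gfp_mask)
{
	struct cpuset *cs;

	if (in_interrupt())
		return true;
	if (node_isset(node, current->mems_allowed))
		return true;
	if (gfp_mask & __GFP_HARDWALL)	/* hardwall request: stop here */
		return false;

	/*
	 * Softwall: walk up to the nearest ancestor with CS_MEM_HARDWALL
	 * set and allow the node if its mems_allowed contains it.
	 */
	cs = nearest_hardwall_ancestor(task_cs(current));
	return node_isset(node, cs->mems_allowed);
}

So only allocations without __GFP_HARDWALL may spill past the current
cpuset, and even then only up to the nearest hardwall ancestor.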
> This patch fixes the issue by having __alloc_pages_slowpath() check for empty
> intersection of cpuset and ac->nodemask before OOM or allocation failure. If
> it's indeed empty, the nodemask is ignored and allocation retried, which mimics
> node_zonelist(). This works fine, because almost all callers of

Well, that would need to be subject to the hardwall flag. The allocation
needs to fail for a hardwall cpuset.
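
I.e. something like the sketch below (just illustrating the idea; the
helper name is made up and this is not the actual patch): ignore the
mempolicy nodemask and retry only when the request is not hardwalled.

/*
 * Illustrative sketch only: decide whether the slowpath may drop the
 * mempolicy nodemask before declaring OOM.
 */
static bool should_retry_without_nodemask(gfp_t gfp_mask,
					  struct alloc_context *ac)
{
	/* No conflict between cpuset and mempolicy nodemask? */
	if (!ac->nodemask ||
	    nodes_intersects(cpuset_current_mems_allowed, *ac->nodemask))
		return false;

	/* Hardwall requests must fail instead of escaping the cpuset */
	if (gfp_mask & __GFP_HARDWALL)
		return false;

	return true;	/* softwall: retry with cpuset limits only */
}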