[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54F469C1.9090601@suse.cz>
Date: Mon, 02 Mar 2015 14:46:41 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: David Rientjes <rientjes@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>
CC: Christoph Lameter <cl@...ux.com>,
Pekka Enberg <penberg@...nel.org>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Johannes Weiner <hannes@...xchg.org>,
Mel Gorman <mgorman@...e.de>,
Pravin Shelar <pshelar@...ira.com>,
Jarno Rajahalme <jrajahalme@...ira.com>,
Li Zefan <lizefan@...wei.com>,
Greg Thelen <gthelen@...gle.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, netdev@...r.kernel.org,
cgroups@...r.kernel.org, dev@...nvswitch.org
Subject: Re: [patch v2 1/3] mm: remove GFP_THISNODE
On 02/27/2015 11:16 PM, David Rientjes wrote:
> NOTE: this is not about __GFP_THISNODE, this is only about GFP_THISNODE.
>
> GFP_THISNODE is a secret combination of gfp bits that have different
> behavior than expected. It is a combination of __GFP_THISNODE,
> __GFP_NORETRY, and __GFP_NOWARN and is special-cased in the page allocator
> slowpath to fail without trying reclaim even though it may be used in
> combination with __GFP_WAIT.
>
> An example of the problem this creates: commit e97ca8e5b864 ("mm: fix
> GFP_THISNODE callers and clarify") fixed up many users of GFP_THISNODE
> that really just wanted __GFP_THISNODE. The problem doesn't end there,
> however, because even it was a no-op for alloc_misplaced_dst_page(),
> which also sets __GFP_NORETRY and __GFP_NOWARN, and
> migrate_misplaced_transhuge_page(), where __GFP_NORETRY and __GFP_NOWAIT
> is set in GFP_TRANSHUGE. Converting GFP_THISNODE to __GFP_THISNODE is
> a no-op in these cases since the page allocator special-cases
> __GFP_THISNODE && __GFP_NORETRY && __GFP_NOWARN.
>
> It's time to just remove GFP_THISNODE entirely. We leave __GFP_THISNODE
> to restrict an allocation to a local node, but remove GFP_THISNODE and
> its obscurity. Instead, we require that a caller clear __GFP_WAIT if it
> wants to avoid reclaim.
>
> This allows the aforementioned functions to actually reclaim as they
> should. It also enables any future callers that want to do
> __GFP_THISNODE but also __GFP_NORETRY && __GFP_NOWARN to reclaim. The
> rule is simple: if you don't want to reclaim, then don't set __GFP_WAIT.
>
> Aside: ovs_flow_stats_update() really wants to avoid reclaim as well, so
> it is unchanged.
>
> Signed-off-by: David Rientjes <rientjes@...gle.com>
Acked-by: Vlastimil Babka <vbabka@...e.cz>
So you've convinced me that this is a non-functional change for slab and
a prerequisity for patch 2/3 which is the main point of this series for
4.0. That said, the new 'goto nopage' condition is still triggered by a
combination of flag states, and the less we have those, the better for
us IMHO.
Looking at commit 952f3b51be which introduced this particular check
using GFP_THISNODE, it seemed like it was a workaround to avoid
triggering reclaim, without needing a new gfp flag. Nowadays, we have
such flag called __GFP_NO_KSWAPD and as I explained in my reply to v1
(where I missed the new condition), passing the flag would already
prevent wake_all_kswapds() and treating the allocation as atomic in
gfp_to_alloc_flags(). So the whole difference would be another
get_page_from_freelist() attempt (possibly less constrained than the
fast path one) and then bail out on !wait.
So it would be IMHO better for longer-term maintainability to have the
relevant __GFP_THISNODE callers pass also __GFP_NO_KSWAPD to denote
these opportunistic allocation attempts, instead of having this subtle
semantic difference attached to __GFP_THISNODE && !__GFP_WAIT. It would
be probably too risky for 4.0. On the other hand, I don't think even
this series is really urgent. It's true that patch 2/3 fixes two
commits, including a 4.0 one, but those commits are already not
regressions without the fix. But maybe current -rcX is low enough to
proceed with this series and catch any regressions in time, allowing the
larger cleanups later.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists