[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160512134348.GK4200@dhcp22.suse.cz>
Date: Thu, 12 May 2016 15:43:50 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Rik van Riel <riel@...hat.com>,
David Rientjes <rientjes@...gle.com>,
Mel Gorman <mgorman@...hsingularity.net>,
Johannes Weiner <hannes@...xchg.org>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFC 05/13] mm, page_alloc: make THP-specific decisions more
generic
On Tue 10-05-16 09:35:55, Vlastimil Babka wrote:
> Since THP allocations during page faults can be costly, extra decisions are
> employed for them to avoid excessive reclaim and compaction, if the initial
> compaction doesn't look promising. The detection has never been perfect as
> there is no gfp flag specific to THP allocations. At this moment it checks the
> whole combination of flags that makes up GFP_TRANSHUGE, and hopes that no other
> users of such combination exist, or would mind being treated the same way.
> Extra care is also taken to separate allocations from khugepaged, where latency
> doesn't matter that much.
>
> It is however possible to distinguish these allocations in a simpler and more
> reliable way. The key observation is that after the initial compaction followed
> by the first iteration of "standard" reclaim/compaction, both __GFP_NORETRY
> allocations and costly allocations without __GFP_REPEAT are declared as
> failures:
>
> /* Do not loop if specifically requested */
> if (gfp_mask & __GFP_NORETRY)
> goto nopage;
>
> /*
> * Do not retry costly high order allocations unless they are
> * __GFP_REPEAT
> */
> if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
> goto nopage;
>
> This means we can further distinguish allocations that are costly order *and*
> additionally include the __GFP_NORETRY flag. As it happens, GFP_TRANSHUGE
> allocations do already fall into this category. This will also allow other
> costly allocations with similar high-order benefit vs latency considerations to
> use this semantic. Furthermore, we can distinguish THP allocations that should
> try a bit harder (such as from khugepageed) by removing __GFP_NORETRY, as will
> be done in the next patch.
Yes, using __GFP_NORETRY makes perfect sense. It is the weakest mode for
the costly allocation which includes both compaction and reclaim. I am
happy to see is_thp_gfp_mask going away.
> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
Acked-by: Michal Hocko <mhocko@...e.com>
> ---
> mm/page_alloc.c | 22 +++++++++-------------
> 1 file changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 88d680b3e7b6..f5d931e0854a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3182,7 +3182,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> return page;
> }
>
> -
> /*
> * Maximum number of compaction retries wit a progress before OOM
> * killer is consider as the only way to move forward.
> @@ -3447,11 +3446,6 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
> return !!(gfp_to_alloc_flags(gfp_mask) & ALLOC_NO_WATERMARKS);
> }
>
> -static inline bool is_thp_gfp_mask(gfp_t gfp_mask)
> -{
> - return (gfp_mask & (GFP_TRANSHUGE | __GFP_KSWAPD_RECLAIM)) == GFP_TRANSHUGE;
> -}
> -
> /*
> * Maximum number of reclaim retries without any progress before OOM killer
> * is consider as the only way to move forward.
> @@ -3610,8 +3604,11 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> if (page)
> goto got_pg;
>
> - /* Checks for THP-specific high-order allocations */
> - if (is_thp_gfp_mask(gfp_mask)) {
> + /*
> + * Checks for costly allocations with __GFP_NORETRY, which
> + * includes THP page fault allocations
> + */
> + if (gfp_mask & __GFP_NORETRY) {
> /*
> * If compaction is deferred for high-order allocations,
> * it is because sync compaction recently failed. If
> @@ -3631,11 +3628,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> goto nopage;
>
> /*
> - * It can become very expensive to allocate transparent
> - * hugepages at fault, so use asynchronous memory
> - * compaction for THP unless it is khugepaged trying to
> - * collapse. All other requests should tolerate at
> - * least light sync migration.
> + * Looks like reclaim/compaction is worth trying, but
> + * sync compaction could be very expensive, so keep
> + * using async compaction, unless it's khugepaged
> + * trying to collapse.
> */
> if (!(current->flags & PF_KTHREAD))
> migration_mode = MIGRATE_ASYNC;
> --
> 2.8.2
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@...ck.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@...ck.org"> email@...ck.org </a>
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists