Message-Id: <201609291802.GFG81203.FLHtOMSJOVFFQO@I-love.SAKURA.ne.jp>
Date: Thu, 29 Sep 2016 18:02:44 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: mhocko@...nel.org, akpm@...ux-foundation.org
Cc: hannes@...xchg.org, mgorman@...e.de, dave.hansen@...el.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, mhocko@...e.com
Subject: Re: [PATCH 2/2] mm: warn about allocations which stall for too long
Michal Hocko wrote:
> From: Michal Hocko <mhocko@...e.com>
>
> Currently we warn only about allocation failures, but small
> allocations are basically nofail and might loop in the page
> allocator for a long time, especially when reclaim cannot make
> any progress. E.g. GFP_NOFS cannot invoke the OOM killer and relies on
> a different context to make forward progress when a lot of
> memory is used by filesystems.
>
> Give us at least a clue when something like this happens and warn about
> allocations which take more than 10s. Print the basic allocation context
> information along with the cumulative time spent in the allocation as
> well as the allocation stack. Repeat the warning after every 10 seconds so
> that we know that the problem is permanent rather than ephemeral.
>
> Signed-off-by: Michal Hocko <mhocko@...e.com>
> ---
> mm/page_alloc.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 969ffc97045b..73f60ad6315f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3495,6 +3495,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> enum compact_result compact_result;
> int compaction_retries = 0;
> int no_progress_loops = 0;
> + unsigned long alloc_start = jiffies;
> + unsigned int stall_timeout = 10 * HZ;
>
> /*
> * In the slowpath, we sanity check order to avoid ever trying to
> @@ -3650,6 +3652,14 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
> goto nopage;
>
> + /* Make sure we know about allocations which stall for too long */
> + if (time_after(jiffies, alloc_start + stall_timeout)) {
> + warn_alloc(gfp_mask,
I would expect "gfp_mask & ~__GFP_NOWARN" rather than "gfp_mask" here.
Otherwise warn_alloc() stays silent for __GFP_NOWARN allocations, and we
get no clue when those allocations stall.
> + "page allocation stalls for %ums, order:%u\n",
> + jiffies_to_msecs(jiffies-alloc_start), order);
> + stall_timeout += 10 * HZ;
> + }
> +
> if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
> did_some_progress > 0, &no_progress_loops))
> goto retry;
> --
> 2.9.3
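
To make the suggestion concrete, here is a minimal userspace sketch of the stall check with the NOWARN bit masked off before warning. It is a model under stated assumptions, not the kernel code: MODEL_GFP_NOWARN, model_time_after(), model_warn_alloc(), struct stall_tracker and stall_check() are hypothetical names invented for illustration, the flag value is arbitrary, and time is counted in abstract ticks rather than jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag value for this model; not the kernel's real bit. */
#define MODEL_GFP_NOWARN 0x200u

/* Wraparound-safe "a is strictly after b", modelled on time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Per-allocation state: start time, next deadline, and warning period. */
struct stall_tracker {
	unsigned long start;
	unsigned long timeout;
	unsigned long period;
};

static void stall_tracker_init(struct stall_tracker *t, unsigned long now,
			       unsigned long period)
{
	assert(period > 0);
	t->start = now;
	t->timeout = period;
	t->period = period;
}

/*
 * Stand-in for warn_alloc(): assumed to stay silent when the NOWARN
 * bit is set. Returns true if a message was printed.
 */
static bool model_warn_alloc(unsigned int gfp_mask, unsigned long stalled,
			     unsigned int order)
{
	if (gfp_mask & MODEL_GFP_NOWARN)
		return false;
	fprintf(stderr, "page allocation stalls for %lu ticks, order:%u\n",
		stalled, order);
	return true;
}

/*
 * Called once per retry iteration. Clears the NOWARN bit so that
 * stalls are reported even for allocations that suppress failure
 * warnings, then pushes the deadline out by one more period.
 */
static bool stall_check(struct stall_tracker *t, unsigned long now,
			unsigned int gfp_mask, unsigned int order)
{
	bool warned;

	if (!model_time_after(now, t->start + t->timeout))
		return false;

	warned = model_warn_alloc(gfp_mask & ~MODEL_GFP_NOWARN,
				  now - t->start, order);
	t->timeout += t->period;
	return warned;
}
```

With the mask cleared inside stall_check(), a caller that passes MODEL_GFP_NOWARN still gets one message per elapsed period, while direct calls to model_warn_alloc() with that bit set remain silent.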