Message-ID: <7ecb4f2d-724f-463f-961f-efba1bdb63d2@suse.cz>
Date: Thu, 30 Jun 2016 09:17:17 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, hughd@...gle.com,
mgorman@...hsingularity.net, minchan@...nel.org,
stable@...r.kernel.org
Subject: Re: [patch for-4.7] mm, compaction: prevent VM_BUG_ON when
terminating freeing scanner

On 06/29/2016 11:47 PM, David Rientjes wrote:
> It's possible to isolate some freepages in a pageblock and then fail
> split_free_page() due to the low watermark check. In this case, we hit
> VM_BUG_ON() because the freeing scanner terminated early without a
> contended lock or enough freepages.
>
> This should never have been a VM_BUG_ON() since it's not a fatal
> condition. It should have been a VM_WARN_ON() at best, or even handled
> gracefully.
>
> Regardless, we need to terminate anytime the full pageblock scan was not
> done. The logic belongs in isolate_freepages_block(), so handle its state
> gracefully by terminating the pageblock loop and making a note to restart
> at the same pageblock next time since it was not possible to complete the
> scan this time.
>
> Reported-by: Minchan Kim <minchan@...nel.org>
> Signed-off-by: David Rientjes <rientjes@...gle.com>
> ---
> Note: I really dislike the low watermark check in split_free_page() and
> consider it poor software engineering. The function should split a free
> page, nothing more. Terminating memory compaction because of a low
> watermark check when we're simply trying to migrate memory seems like an
> arbitrary heuristic. There was an objection to removing it in the first
> proposed patch, but I think we should really consider removing that
> check so this is simpler.
There's a patch changing it to the min watermark (you were CC'd on the
series). We could argue whether the check belongs in split_free_page() or
in some wrapper around it, but I don't think it should be removed
completely. If a zone is struggling with order-0 pages, a mechanism for
creating higher-order pages shouldn't make things even worse. It's also
not that arbitrary: even if the migration succeeded and created a
high-order page, the higher-order allocation would still fail due to
watermark checks. Worse, __compact_finished() would keep telling
compaction to continue, creating an even longer lag, which also goes
against your recent patches.
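To make the argument concrete, here is a minimal sketch of the kind of
check being discussed. It is a toy model, not the mm/page_alloc.c code:
struct toy_zone, toy_watermark_ok() and toy_split_free_page() are
hypothetical names, and the real watermark check looks at more than a
single free-page counter.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a zone: just two counters. */
struct toy_zone {
	unsigned long free_pages;	/* order-0 pages currently free */
	unsigned long watermark;	/* threshold we must not drop below */
};

/* Would the zone stay at or above the watermark after taking nr pages? */
static bool toy_watermark_ok(const struct toy_zone *z, unsigned long nr)
{
	return z->free_pages >= z->watermark + nr;
}

/*
 * Toy version of "split a free page of the given order for compaction".
 * Returns the number of pages isolated, or 0 if the watermark check
 * fails, in which case the zone is left untouched.
 */
static unsigned long toy_split_free_page(struct toy_zone *z, unsigned int order)
{
	unsigned long nr = 1UL << order;

	if (!toy_watermark_ok(z, nr))
		return 0;
	z->free_pages -= nr;
	return nr;
}
```

In this model a zone sitting just above its watermark refuses to hand
out free pages as migration targets, which is the "don't make order-0
pressure worse" behaviour argued for above.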
> mm/compaction.c | 37 +++++++++++++++----------------------
> 1 file changed, 15 insertions(+), 22 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1009,8 +1009,6 @@ static void isolate_freepages(struct compact_control *cc)
> block_end_pfn = block_start_pfn,
> block_start_pfn -= pageblock_nr_pages,
> isolate_start_pfn = block_start_pfn) {
> - unsigned long isolated;
> -
> /*
> * This can iterate a massively long zone without finding any
> * suitable migration targets, so periodically check if we need
> @@ -1034,36 +1032,31 @@ static void isolate_freepages(struct compact_control *cc)
> continue;
>
> /* Found a block suitable for isolating free pages from. */
> - isolated = isolate_freepages_block(cc, &isolate_start_pfn,
> - block_end_pfn, freelist, false);
> - /* If isolation failed early, do not continue needlessly */
> - if (!isolated && isolate_start_pfn < block_end_pfn &&
> - cc->nr_migratepages > cc->nr_freepages)
> - break;
> + isolate_freepages_block(cc, &isolate_start_pfn, block_end_pfn,
> + freelist, false);
>
> /*
> - * If we isolated enough freepages, or aborted due to async
> - * compaction being contended, terminate the loop.
> - * Remember where the free scanner should restart next time,
> - * which is where isolate_freepages_block() left off.
> - * But if it scanned the whole pageblock, isolate_start_pfn
> - * now points at block_end_pfn, which is the start of the next
> - * pageblock.
> - * In that case we will however want to restart at the start
> - * of the previous pageblock.
> + * If we isolated enough freepages, or aborted due to lock
> + * contention, terminate.
> */
> if ((cc->nr_freepages >= cc->nr_migratepages)
> || cc->contended) {
> - if (isolate_start_pfn >= block_end_pfn)
> + if (isolate_start_pfn >= block_end_pfn) {
> + /*
> + * Restart at previous pageblock if more
> + * freepages can be isolated next time.
> + */
That's not as explanatory as before :/ oh well...
> isolate_start_pfn =
> block_start_pfn - pageblock_nr_pages;
> + }
> break;
> - } else {
> + } else if (isolate_start_pfn < block_end_pfn) {
> /*
> - * isolate_freepages_block() should not terminate
> - * prematurely unless contended, or isolated enough
> + * If isolation failed early, do not continue
> + * needlessly.
> */
> - VM_BUG_ON(isolate_start_pfn < block_end_pfn);
> + isolate_start_pfn = block_start_pfn;
Note that this reset shouldn't in fact be necessary: without it, the next
attempt would restart exactly at the pfn where the split failed due to
the watermark check. But not a big deal.
> + break;
> }
> }
>
>
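The termination logic in the patched loop can be modelled in isolation.
The following is a rough sketch only: struct toy_state and
toy_after_block() are hypothetical names, the fields merely mirror the
compact_control members used in the diff, and the pfn values in any
caller are made up.

```c
#include <stdbool.h>

enum toy_action { TOY_CONTINUE, TOY_STOP };

/* Hypothetical subset of the state consulted by the loop in the diff. */
struct toy_state {
	unsigned long nr_freepages;
	unsigned long nr_migratepages;
	bool contended;
};

/*
 * Decide what to do after scanning one pageblock [block_start, block_end).
 * *isolate_start is where the block scan left off; on TOY_STOP it is
 * updated to the pfn the free scanner should restart from next time.
 * The free scanner walks towards lower pfns, so the "previous" pageblock
 * is the one just below block_start.
 */
static enum toy_action toy_after_block(const struct toy_state *cc,
				       unsigned long block_start,
				       unsigned long block_end,
				       unsigned long pageblock_nr_pages,
				       unsigned long *isolate_start)
{
	if (cc->nr_freepages >= cc->nr_migratepages || cc->contended) {
		/* Whole block scanned: restart at the previous pageblock. */
		if (*isolate_start >= block_end)
			*isolate_start = block_start - pageblock_nr_pages;
		return TOY_STOP;
	}
	if (*isolate_start < block_end) {
		/* Scan stopped early: retry this pageblock next time. */
		*isolate_start = block_start;
		return TOY_STOP;
	}
	return TOY_CONTINUE;
}
```

Dropping the assignment in the second branch would correspond to the
variant mentioned above, where the next attempt resumes at the pfn that
failed to split instead of at the start of the pageblock.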