Date:	Mon, 26 Sep 2011 16:02:12 +0100
From:	Mel Gorman <mgorman@...e.de>
To:	Rik van Riel <riel@...hat.com>
Cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org, Johannes Weiner <hannes@...xchg.org>
Subject: Re: [PATCH -mm] limit direct reclaim for higher order allocations

On Mon, Sep 26, 2011 at 09:55:07AM -0400, Rik van Riel wrote:
> When suffering from memory fragmentation due to unfreeable pages,
> THP page faults will repeatedly try to compact memory.  Due to
> the unfreeable pages, compaction fails.
> 
> Needless to say, at that point page reclaim also fails to create
> free contiguous 2MB areas.  However, that doesn't stop the current
> code from trying, over and over again, and freeing a minimum of
> 4MB (2UL << sc->order pages) at every single invocation.
> 
> This resulted in my 12GB system having 2-3GB free memory, a
> corresponding amount of used swap and very sluggish response times.
> 
> This can be avoided by having the direct reclaim code not reclaim
> from zones that already have plenty of free memory available for
> compaction.
> 
> If compaction still fails due to unmovable memory, doing additional
> reclaim will only hurt the system, not help.
> 
> Signed-off-by: Rik van Riel <riel@...hat.com>
> ---
> I believe Mel has another idea in mind on how to fix this issue. 
> I believe it will be good to compare both approaches side by side...
> 
>  mm/vmscan.c |   16 ++++++++++++++++
>  1 files changed, 16 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index b7719ec..56811a1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2083,6 +2083,22 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
>  				continue;
>  			if (zone->all_unreclaimable && priority != DEF_PRIORITY)
>  				continue;	/* Let kswapd poll it */
> +			if (COMPACTION_BUILD) {
> +				/*
> +				 * If we already have plenty of memory free
> +				 * for compaction, don't free any more.
> +				 */
> +				unsigned long balance_gap;
> +				balance_gap = min(low_wmark_pages(zone),
> +					(zone->present_pages +
> +					KSWAPD_ZONE_BALANCE_GAP_RATIO-1) /
> +					KSWAPD_ZONE_BALANCE_GAP_RATIO);
> +				if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
> +					zone_watermark_ok_safe(zone, 0,
> +					high_wmark_pages(zone) + balance_gap +
> +					(2UL << sc->order), 0, 0))
> +					continue;
> +			}
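
To make the numbers in the hunk above concrete, here is a small userspace
sketch of the threshold arithmetic. It is not kernel code: the helper names
are hypothetical, and the 4KiB page size, the THP order of 9 and the ratio
of 100 are assumptions for illustration rather than values stated in the
patch.

```c
#include <stdio.h>

/* Assumed values for illustration; none of these come from the patch. */
#define MODEL_PAGE_SIZE		4096UL	/* 4KiB base pages */
#define MODEL_THP_ORDER		9	/* 2MB / 4KiB = 512 = 1 << 9 */
#define MODEL_GAP_RATIO		100UL	/* stand-in for KSWAPD_ZONE_BALANCE_GAP_RATIO */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Same shape as the patch: min(low watermark, present_pages / ratio rounded up). */
static unsigned long model_balance_gap(unsigned long low_wmark,
				       unsigned long present_pages)
{
	return min_ul(low_wmark,
		      (present_pages + MODEL_GAP_RATIO - 1) / MODEL_GAP_RATIO);
}

/* Free-page mark above which the patch would skip reclaim for the zone. */
static unsigned long model_skip_threshold(unsigned long high_wmark,
					  unsigned long low_wmark,
					  unsigned long present_pages,
					  int order)
{
	return high_wmark + model_balance_gap(low_wmark, present_pages) +
	       (2UL << order);
}

/* Bytes covered by the 2UL << order term. */
static unsigned long model_order_bytes(int order)
{
	return (2UL << order) * MODEL_PAGE_SIZE;
}
```

With these assumptions, 2UL << 9 is 1024 pages, which is the 4MB mentioned
in the changelog, and that term is added on top of the high watermark and
the balance gap.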

I don't have a proper patch prepared, but I think it is a mistake for
reclaim and compaction to use different logic when deciding whether
action should be taken. Compaction uses compaction_suitable() and
compaction_deferred() to decide whether it should compact or not,
and reclaim/compaction should share the same logic. The check would
look something like:

		/*
		 * If reclaiming for THP, check if try_to_compact_pages
		 * would try and compact this zone or if compaction is deferred
		 * due to a recent failure. If these conditions are met,
		 * we should not reclaim more pages as the cost of reclaiming an
		 * excessive number of pages exceeds the benefit of using huge
		 * pages. If we are not reclaiming, pretend we have reclaimed
		 * pages so the caller bails.
		 */
		if ((sc->gfp_mask & __GFP_NO_KSWAPD) &&
		    (compaction_suitable(zone, sc->order) ||
		     compaction_deferred(zone))) {
			sc->nr_scanned = SWAP_CLUSTER_MAX;
			sc->nr_reclaimed = SWAP_CLUSTER_MAX;
			continue;
		}

compaction_suitable() takes the amount of free memory into account,
so it is similar to your patch in that it checks "if we already have
plenty of memory free for compaction".
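
As a rough illustration of the control flow only, the check can be modelled
in userspace as below. Everything here is a hypothetical stand-in: the
struct and function names are invented for the sketch, the value 32 for the
SWAP_CLUSTER_MAX stand-in is an assumption, and compaction_suitable() is
reduced to a single free-memory comparison, which is a simplification of
what the real function does.

```c
#include <stdbool.h>

#define MODEL_SWAP_CLUSTER_MAX	32UL	/* assumed stand-in for SWAP_CLUSTER_MAX */

/* Hypothetical, heavily reduced view of a zone. */
struct zone_model {
	unsigned long free_pages;
	unsigned long low_wmark;
	bool compaction_deferred;	/* models compaction_deferred(zone) */
};

/* Hypothetical, heavily reduced view of struct scan_control. */
struct scan_model {
	int order;
	bool no_kswapd;			/* models sc->gfp_mask & __GFP_NO_KSWAPD */
	unsigned long nr_scanned;
	unsigned long nr_reclaimed;
};

/*
 * Simplified model: compaction is considered worthwhile once free memory
 * exceeds the low watermark by 2UL << order pages. This is an assumption
 * made for the sketch, not the full kernel logic.
 */
static bool model_compaction_suitable(const struct zone_model *zone, int order)
{
	return zone->free_pages >= zone->low_wmark + (2UL << order);
}

/*
 * Returns true if reclaim should skip this zone. When skipping, pretend
 * that pages were scanned and reclaimed so the caller bails out.
 */
static bool model_skip_zone(const struct zone_model *zone,
			    struct scan_model *sc)
{
	if (sc->no_kswapd &&
	    (model_compaction_suitable(zone, sc->order) ||
	     zone->compaction_deferred)) {
		sc->nr_scanned = MODEL_SWAP_CLUSTER_MAX;
		sc->nr_reclaimed = MODEL_SWAP_CLUSTER_MAX;
		return true;
	}
	return false;
}
```

In this model a zone is skipped either because it already has enough free
memory for compaction to be attempted or because compaction recently failed
and is deferred. Allocations without the flag fall through to normal reclaim.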

-- 
Mel Gorman
SUSE Labs