Message-ID: <20110218122203.GA13246@csn.ul.ie>
Date:	Fri, 18 Feb 2011 12:22:03 +0000
From:	Mel Gorman <mel@....ul.ie>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Johannes Weiner <hannes@...xchg.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	Michal Hocko <mhocko@...e.cz>,
	Kent Overstreet <kent.overstreet@...il.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: vmscan: Stop reclaim/compaction earlier due to
	insufficient progress if !__GFP_REPEAT

On Thu, Feb 17, 2011 at 02:22:09PM -0800, Andrew Morton wrote:
> On Wed, 16 Feb 2011 09:50:49 +0000
> Mel Gorman <mel@....ul.ie> wrote:
> 
> > should_continue_reclaim() for reclaim/compaction allows scanning to continue
> > even if pages are not being reclaimed until the full list is scanned. In
> > terms of allocation success, this makes sense but potentially it introduces
> > unwanted latency for high-order allocations such as transparent hugepages
> > and network jumbo frames that would prefer to fail the allocation attempt
> > and fallback to order-0 pages.  Worse, there is a potential that the full
> > LRU scan will clear all the young bits, distort page aging information and
> > potentially push pages into swap that would have otherwise remained resident.
> 
> afaict the patch affects order-0 allocations as well.  What are the
> implications of this?
> 

Order-0 allocations should not be affected. RECLAIM_MODE_COMPACTION is not
set for them, so the following check returns before gfp_mask is examined:

        if (!(sc->reclaim_mode & RECLAIM_MODE_COMPACTION))
                return false;

> Also, what might be the downsides of this change, and did you test for
> them?
> 

The main downside I predict is that worst-case latencies for successful
transparent hugepage allocations will increase because there will be more
looping in do_try_to_free_pages() at higher priorities. I would also not be
surprised if fewer allocations succeeded.

Latencies did seem to be worse for order-9 allocations in testing, but this
was offset by lower latencies for lower orders and seemed an acceptable
trade-off.

Other major consequences did not spring to mind.

> > This patch will stop reclaim/compaction if no pages were reclaimed in the
> > last SWAP_CLUSTER_MAX pages that were considered.
> 
> a) Why SWAP_CLUSTER_MAX?  Is (SWAP_CLUSTER_MAX+7) better or worse?
> 

SWAP_CLUSTER_MAX is the standard "unit of reclaim" and that's what I had
in mind when writing the comment, but the comment is wrong and misleading.
More on this below.

> b) The sentence doesn't seem even vaguely accurate.  shrink_zone()
>    will scan vastly more than SWAP_CLUSTER_MAX pages before calling
>    should_continue_reclaim().  Confused.
> 
> c) The patch doesn't "stop reclaim/compaction" fully.  It stops it
>    against one zone.  reclaim will then advance on to any other
>    eligible zones.

You're right on both counts and the comment is inaccurate. It should
have read:

This patch will stop reclaim/compaction for the current zone in shrink_zone()
if there were no pages reclaimed in the last batch of scanning at the
current priority.  For allocations such as hugetlbfs that use __GFP_REPEAT
and have fewer fallback options, the full LRU list may still be scanned.

The comment in the code itself then becomes

+               /*
+                * For non-__GFP_REPEAT allocations which can presumably
+                * fail without consequence, stop if we failed to reclaim
+                * any pages from the last batch of pages that were scanned.
+                * This will return to the caller faster at the risk that
+                * reclaim/compaction and the resulting allocation attempt
+                * fail.
+                */
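
For reference, the decision described above can be modelled in userspace as
below. This is a sketch under stated assumptions, not the patch itself: the
names and flag values prefixed sketch_ or SKETCH_ are hypothetical, the
__GFP_REPEAT stop condition (nothing scanned and nothing reclaimed) is my
reading of "the full LRU list may still be scanned", and the later checks in
the real function are replaced by a plain "continue".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the kernel's real bit values. */
#define SKETCH_RECLAIM_MODE_COMPACTION 0x10u
#define SKETCH_GFP_REPEAT              0x400u

struct sketch_scan_control {
	unsigned int reclaim_mode;
	unsigned int gfp_mask;
};

/*
 * nr_reclaimed and nr_scanned describe the last batch of scanning for
 * the current zone at the current priority.
 */
bool sketch_should_continue_reclaim(const struct sketch_scan_control *sc,
				    unsigned long nr_reclaimed,
				    unsigned long nr_scanned)
{
	/* Callers not in reclaim/compaction mode stop immediately. */
	if (!(sc->reclaim_mode & SKETCH_RECLAIM_MODE_COMPACTION))
		return false;

	if (sc->gfp_mask & SKETCH_GFP_REPEAT) {
		/*
		 * Assumed condition: keep going until scanning finds
		 * nothing more to look at and nothing was reclaimed.
		 */
		if (!nr_reclaimed && !nr_scanned)
			return false;
	} else {
		/* Can fail cheaply: stop as soon as a batch frees nothing. */
		if (!nr_reclaimed)
			return false;
	}

	return true;	/* further checks elided in this sketch */
}

/* Usage example: same unproductive batch, different outcome per caller. */
void sketch_self_check(void)
{
	struct sketch_scan_control thp = {
		SKETCH_RECLAIM_MODE_COMPACTION, 0
	};
	struct sketch_scan_control hugetlb = {
		SKETCH_RECLAIM_MODE_COMPACTION, SKETCH_GFP_REPEAT
	};

	assert(!sketch_should_continue_reclaim(&thp, 0, 32));
	assert(sketch_should_continue_reclaim(&hugetlb, 0, 32));
}
```

A batch that scans pages but reclaims none ends reclaim/compaction for the
zone in the non-__GFP_REPEAT case, while the __GFP_REPEAT caller keeps
scanning.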

-- 
Mel Gorman
SUSE Labs