linux-kernel - Re: [PATCH 02/10] mm: vmscan: Obey proportional scanning requirements for kswapd

Message-ID: <20130321140154.GL6094@dhcp22.suse.cz>
Date:	Thu, 21 Mar 2013 15:01:54 +0100
From:	Michal Hocko <mhocko@...e.cz>
To:	Mel Gorman <mgorman@...e.de>
Cc:	Linux-MM <linux-mm@...ck.org>, Jiri Slaby <jslaby@...e.cz>,
	Valdis Kletnieks <Valdis.Kletnieks@...edu>,
	Rik van Riel <riel@...hat.com>,
	Zlatko Calusic <zcalusic@...sync.net>,
	Johannes Weiner <hannes@...xchg.org>,
	dormando <dormando@...ia.net>,
	Satoru Moriya <satoru.moriya@....com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 02/10] mm: vmscan: Obey proportional scanning
 requirements for kswapd

On Sun 17-03-13 13:04:08, Mel Gorman wrote:
> Simplistically, the anon and file LRU lists are scanned proportionally
> depending on the value of vm.swappiness although there are other factors
> taken into account by get_scan_count().  The patch "mm: vmscan: Limit
> the number of pages kswapd reclaims" limits the number of pages kswapd
> reclaims but it breaks this proportional scanning and may evenly shrink
> anon/file LRUs regardless of vm.swappiness.
> 
> This patch preserves the proportional scanning and reclaim. It does mean
> that kswapd will reclaim more than requested but the number of pages will
> be related to the high watermark.
> 
> Signed-off-by: Mel Gorman <mgorman@...e.de>
> ---
>  mm/vmscan.c | 52 +++++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 41 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 4835a7a..182ff15 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1815,6 +1815,45 @@ out:
>  	}
>  }
>  
> +static void recalculate_scan_count(unsigned long nr_reclaimed,
> +		unsigned long nr_to_reclaim,
> +		unsigned long nr[NR_LRU_LISTS])
> +{
> +	enum lru_list l;
> +
> +	/*
> +	 * For direct reclaim, reclaim the number of pages requested. Less
> +	 * care is taken to ensure that scanning for each LRU is properly
> +	 * proportional. This is unfortunate and is improper aging but
> +	 * minimises the amount of time a process is stalled.
> +	 */
> +	if (!current_is_kswapd()) {
> +		if (nr_reclaimed >= nr_to_reclaim) {
> +			for_each_evictable_lru(l)
> +				nr[l] = 0;
> +		}
> +		return;

Heh, this is a rather cryptic way of expressing what could be done in
shrink_lruvec as:
	if (!current_is_kswapd()) {
		if (nr_reclaimed >= nr_to_reclaim)
			break;
	}

Besides that, this is not memcg aware, which I think would break
targeted reclaim. Targeted reclaim is a kind of direct reclaim, but it
would still be good to keep it proportional because it starts with
DEF_PRIORITY.

I would suggest moving this back to shrink_lruvec and updating the test
as follows:
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 182ff15..5cf5a4b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1822,23 +1822,9 @@ static void recalculate_scan_count(unsigned long nr_reclaimed,
 	enum lru_list l;
 
 	/*
-	 * For direct reclaim, reclaim the number of pages requested. Less
-	 * care is taken to ensure that scanning for each LRU is properly
-	 * proportional. This is unfortunate and is improper aging but
-	 * minimises the amount of time a process is stalled.
-	 */
-	if (!current_is_kswapd()) {
-		if (nr_reclaimed >= nr_to_reclaim) {
-			for_each_evictable_lru(l)
-				nr[l] = 0;
-		}
-		return;
-	}
-
-	/*
-	 * For kswapd, reclaim at least the number of pages requested.
-	 * However, ensure that LRUs shrink by the proportion requested
-	 * by get_scan_count() so vm.swappiness is obeyed.
+	 * Reclaim at least the number of pages requested. However,
+	 * ensure that LRUs shrink by the proportion requested by
+	 * get_scan_count() so vm.swappiness is obeyed.
 	 */
 	if (nr_reclaimed >= nr_to_reclaim) {
 		unsigned long min = ULONG_MAX;
@@ -1881,6 +1867,18 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 			}
 		}
 
+		/*
+		 * For global direct reclaim, reclaim the number of
+		 * pages requested. Less care is taken to ensure that
+		 * scanning for each LRU is properly proportional. This
+		 * is unfortunate and is improper aging but minimises
+		 * the amount of time a process is stalled.
+		 */
+		if (global_reclaim(sc) && !current_is_kswapd()) {
+			if (nr_reclaimed >= nr_to_reclaim)
+				break;
+		}
+
 		recalculate_scan_count(nr_reclaimed, nr_to_reclaim, nr);
 	}
 	blk_finish_plug(&plug);

> +	}
> +
> +	/*
> +	 * For kswapd, reclaim at least the number of pages requested.
> +	 * However, ensure that LRUs shrink by the proportion requested
> +	 * by get_scan_count() so vm.swappiness is obeyed.
> +	 */
> +	if (nr_reclaimed >= nr_to_reclaim) {
> +		unsigned long min = ULONG_MAX;
> +
> +		/* Find the LRU with the fewest pages to reclaim */
> +		for_each_evictable_lru(l)
> +			if (nr[l] < min)
> +				min = nr[l];
> +
> +		/* Normalise the scan counts so kswapd scans proportionally */
> +		for_each_evictable_lru(l)
> +			nr[l] -= min;
> +	}

It looked scary at first glance but it makes sense. Every round (after we
have reclaimed enough) the LRU with the smallest remaining count is
pulled out (it drops to zero) and the others are inhibited by the same
amount.
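
To make that concrete, here is a minimal userspace sketch of the
normalisation step only. It is not the kernel code: NR_EVICTABLE and
normalise_scan_counts() are hypothetical names for illustration, and a
plain array index stands in for for_each_evictable_lru().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the number of evictable LRU lists. */
#define NR_EVICTABLE 4

/*
 * Subtract the smallest remaining scan count from every list, mirroring
 * the two loops in the quoted recalculate_scan_count(). The list that
 * held the minimum ends up at zero; the rest keep only the excess they
 * had over that minimum.
 */
static void normalise_scan_counts(unsigned long nr[NR_EVICTABLE])
{
	unsigned long min = ULONG_MAX;
	int l;

	assert(nr != NULL);

	/* Find the list with the fewest pages left to scan. */
	for (l = 0; l < NR_EVICTABLE; l++)
		if (nr[l] < min)
			min = nr[l];

	/* Reduce every list by that amount. */
	for (l = 0; l < NR_EVICTABLE; l++)
		nr[l] -= min;
}
```

With remaining counts of { 8, 32, 64, 16 }, one call leaves
{ 0, 24, 56, 8 }; a second call changes nothing because the minimum is
already zero.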

> +}
> +
>  /*
>   * This is a basic per-zone page freer.  Used by both kswapd and direct reclaim.
>   */
> @@ -1841,17 +1880,8 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
>  							    lruvec, sc);
>  			}
>  		}
> -		/*
> -		 * On large memory systems, scan >> priority can become
> -		 * really large. This is fine for the starting priority;
> -		 * we want to put equal scanning pressure on each zone.
> -		 * However, if the VM has a harder time of freeing pages,
> -		 * with multiple processes reclaiming pages, the total
> -		 * freeing target can get unreasonably large.
> -		 */
> -		if (nr_reclaimed >= nr_to_reclaim &&
> -		    sc->priority < DEF_PRIORITY)
> -			break;
> +
> +		recalculate_scan_count(nr_reclaimed, nr_to_reclaim, nr);
>  	}
>  	blk_finish_plug(&plug);
>  	sc->nr_reclaimed += nr_reclaimed;
> -- 
> 1.8.1.4
> 

-- 
Michal Hocko
SUSE Labs