Date:	Thu, 11 Nov 2010 12:15:46 +0000
From:	Mel Gorman <mel@....ul.ie>
To:	Michel Lespinasse <walken@...gle.com>
Cc:	Rik van Riel <riel@...hat.com>,
	Wu Fengguang <fengguang.wu@...el.com>,
	"H. Peter Anvin" <hpa@...or.com>, linux-arch@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, Ying Han <yinghan@...gle.com>,
	Andrea Arcangeli <aarcange@...hat.com>
Subject: [BUG] Commit d065bd81 severely regresses huge page allocation
	success rates

When testing 2.6.37-rc1, I noticed that huge page allocation success
rates were severely impaired. Bisection showed that commit [d065bd81: mm:
retry page fault when blocking on disk transfer] was the biggest factor.
Reverting the patch confirmed this. Here are the results of a high-order
allocation stress test. The vanilla kernel is 2.6.37-rc1 and the revert
kernel has this commit removed with minor conflicts cleaned up.

STRESS-HIGHALLOC
             highalloc-vanilla   revert-d065bd81
Pass 1           7.00 ( 0.00%)    73.00 (66.00%)
Pass 2           7.00 ( 0.00%)    92.00 (85.00%)
At Rest         13.00 ( 0.00%)    93.00 (80.00%)

"Pass 1" and "Pass 2" are allocation attempts made while the machine is
under heavy load. One might expect allocations to fail there, but even when
the machine is fully at rest, with all memory freed and nothing else going
on, the pages still cannot be allocated.

I had ftrace enabled and found this.

FTrace Reclaim Statistics: vmscan
                                           vanilla revert-d065bd81
Direct reclaims                               3687        889
Direct reclaim pages scanned              39767013     195182
Direct reclaim pages reclaimed              115079     107891
Direct reclaim write file async I/O          13598       5777
Direct reclaim write anon async I/O          70886      40954
Direct reclaim write file sync I/O               0          0
Direct reclaim write anon sync I/O              37        178
Wake kswapd requests                          6508        868
Kswapd wakeups                                1291        521
Kswapd pages scanned                      77859381    3240330
Kswapd pages reclaimed                     2548099    1965881
Kswapd reclaim write file async I/O          51266      56838
Kswapd reclaim write anon async I/O         935070     392199
Kswapd reclaim write file sync I/O               0          0
Kswapd reclaim write anon sync I/O               0          0
Time stalled direct reclaim (seconds)      1160.57     636.24
Time kswapd awake (seconds)                1453.81     654.25

Total pages scanned                      117626394   3435512
Total pages reclaimed                      2663178   2073772
%age total pages scanned/reclaimed           2.26%    60.36%
%age total pages scanned/written             0.91%    14.44%
%age  file pages scanned/written             0.06%     1.82%
Percentage Time Spent Direct Reclaim        25.92%    15.98%
Percentage Time kswapd Awake                65.57%    36.01%
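
For anyone cross-checking the derived rows, here is a minimal sketch of how
the totals and the three page percentages appear to follow from the raw
counters above. The dictionary keys and the derive() helper are hypothetical
names chosen for illustration, not part of any reporting script, and the
formulas are inferred from the numbers rather than taken from the tool that
produced the table.

```python
# Raw counters copied from the ftrace table: (vanilla, revert-d065bd81).
# Key names are hypothetical; only the numbers come from the report.
COUNTERS = {
    "direct_scanned":   (39767013, 195182),
    "direct_reclaimed": (115079, 107891),
    "kswapd_scanned":   (77859381, 3240330),
    "kswapd_reclaimed": (2548099, 1965881),
    # async + sync writes, direct reclaim + kswapd
    "file_written":     (13598 + 0 + 51266 + 0, 5777 + 0 + 56838 + 0),
    "anon_written":     (70886 + 37 + 935070 + 0, 40954 + 178 + 392199 + 0),
}


def derive(kernel):
    """Return derived figures for kernel 0 (vanilla) or 1 (revert)."""
    c = {name: values[kernel] for name, values in COUNTERS.items()}
    scanned = c["direct_scanned"] + c["kswapd_scanned"]
    reclaimed = c["direct_reclaimed"] + c["kswapd_reclaimed"]
    written = c["file_written"] + c["anon_written"]
    return {
        "total_scanned": scanned,
        "total_reclaimed": reclaimed,
        # Assumed formula: pages reclaimed as a percentage of pages scanned.
        "reclaimed_pct": 100.0 * reclaimed / scanned,
        # Assumed formula: pages written as a percentage of pages scanned.
        "written_pct": 100.0 * written / scanned,
        # Assumed formula: file pages written as a percentage of pages scanned.
        "file_written_pct": 100.0 * c["file_written"] / scanned,
    }


if __name__ == "__main__":
    for index, label in enumerate(("vanilla", "revert-d065bd81")):
        d = derive(index)
        print(f"{label:>16}: scanned={d['total_scanned']} "
              f"reclaimed={d['total_reclaimed']} "
              f"reclaimed%={d['reclaimed_pct']:.2f} "
              f"written%={d['written_pct']:.2f} "
              f"file written%={d['file_written_pct']:.2f}")
```

Under these assumptions the "scanned/reclaimed" row is pages reclaimed as a
share of pages scanned. The two time percentages are not reproduced because
the total test duration is not included in the report.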

Reverting the commit improves overall reclaim behaviour when allocating huge
pages. Note in particular the low scanned/reclaimed percentage in the
vanilla kernel (2.26% versus 60.36%), which implies that the vanilla kernel
is endlessly scanning pages it cannot reclaim. I also note that nr_inactive_*
remains high with the vanilla kernel but drops when the patch is reverted,
implying that the patch is preventing pages from being reclaimed.

It does not make a difference if compaction is used - the figures are
still brutal.

I have not digested what the patch is doing but am reporting it in case
people familiar with the patch spot the problem quickly.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab