Message-ID: <20111216160728.GI3487@suse.de>
Date: Fri, 16 Dec 2011 16:07:28 +0000
From: Mel Gorman <mgorman@...e.de>
To: Johannes Weiner <jweiner@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Minchan Kim <minchan.kim@...il.com>,
Dave Jones <davej@...hat.com>, Jan Kara <jack@...e.cz>,
Andy Isaacson <adi@...apodia.org>,
Rik van Riel <riel@...hat.com>, Nai Xia <nai.xia@...il.com>,
Linux-MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 11/11] mm: Isolate pages for immediate reclaim on their
own LRU
On Fri, Dec 16, 2011 at 04:17:31PM +0100, Johannes Weiner wrote:
> On Wed, Dec 14, 2011 at 03:41:33PM +0000, Mel Gorman wrote:
> > It was observed that scan rates from direct reclaim during tests
> > writing to both fast and slow storage were extraordinarily high. The
> > problem was that while pages were being marked for immediate reclaim
> > when writeback completed, the same pages were being encountered over
> > and over again during LRU scanning.
> >
> > This patch isolates file-backed pages that are to be reclaimed when
> > clean on their own LRU list.
>
> Excuse me if I sound like a broken record, but have those observations
> of high scan rates persisted with the per-zone dirty limits patchset?
>
Unfortunately I wasn't testing that series. The focus of this series
was primarily on THP-related stalls incurred by compaction, which
did not depend on that series. Even with dirty balancing,
similar stalls would be observed once dirty pages were in the zone
at all.
> In my tests with pzd, the scan rates went down considerably together
> with the immediate reclaim / vmscan writes.
>
I probably should know, but what is pzd?
> Our dirty limits are pretty low - if reclaim keeps shuffling through
> dirty pages, where are the 80% reclaimable pages?! To me, this sounds
> like the unfair distribution of dirty pages among zones again. Is
> there a different explanation that I missed?
>
The alternative explanation is that the 20% dirty pages are all
long-lived and sit at the end of the highest zone, which is always
scanned first, so we continually have to scan over these dirty pages
for prolonged periods of time.
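
As a rough illustration of that effect, here is a toy userspace model
(not kernel code; NR_PAGES, NR_DIRTY and NR_PASSES are made-up numbers)
of a zone whose tail is long-lived dirty pages that direct reclaim
cannot write back and therefore keeps rotating and rescanning:

#include <stdio.h>
#include <stdbool.h>

#define NR_PAGES  1000
#define NR_DIRTY   200          /* the long-lived dirty tail of the zone */
#define NR_PASSES   50          /* repeated reclaim passes over the zone */

int main(void)
{
        bool dirty[NR_PAGES], present[NR_PAGES];
        long scanned = 0, reclaimed = 0;
        int i, pass;

        for (i = 0; i < NR_PAGES; i++) {
                present[i] = true;
                dirty[i] = (i >= NR_PAGES - NR_DIRTY);
        }

        for (pass = 0; pass < NR_PASSES; pass++) {
                for (i = 0; i < NR_PAGES; i++) {
                        if (!present[i])
                                continue;
                        scanned++;
                        if (!dirty[i]) {
                                /* clean page: reclaimed, gone from the LRU */
                                reclaimed++;
                                present[i] = false;
                        }
                        /*
                         * Dirty page: not written from direct reclaim, so
                         * it is rotated and walked over again next pass.
                         */
                }
        }

        printf("scanned %ld, reclaimed %ld (%ld%% efficiency)\n",
               scanned, reclaimed, scanned ? 100 * reclaimed / scanned : 0);
        return 0;
}

With those numbers it prints "scanned 10800, reclaimed 800 (7%
efficiency)": the scan count is dominated by walking over the same
dirty tail on every pass, which is the same shape as the direct
reclaim figures further down.
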
> PS: It also seems a bit out of place in this series...?
Without the last patch, the System CPU time was stupidly high. In part,
this is because we are no longer calling ->writepage from direct
reclaim. If we were, the CPU usage would be far lower but it would
be a lot slower too. It seemed remiss to leave system CPU usage that
high without some explanation or patch dealing with it.
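
For reference, the decision involved is roughly of this shape. This is
an illustrative, userspace-compilable sketch and not the actual
mm/vmscan.c code; struct fake_page, scan_one_page() and the flag are
stand-ins for the real structures:

#include <stdio.h>
#include <stdbool.h>

struct fake_page {
        bool dirty;
        bool reclaim_flag;              /* stand-in for PG_reclaim */
};

enum page_action { PAGE_RECLAIM, PAGE_WRITE, PAGE_ROTATE };

static enum page_action scan_one_page(struct fake_page *page, bool is_kswapd)
{
        if (!page->dirty)
                return PAGE_RECLAIM;    /* clean pages can simply be freed */

        if (!is_kswapd) {
                /*
                 * Direct reclaim: do not call ->writepage.  Tag the page
                 * so it is reclaimed as soon as writeback cleans it, then
                 * rotate it and keep scanning - hence the CPU cost.
                 */
                page->reclaim_flag = true;
                return PAGE_ROTATE;
        }

        return PAGE_WRITE;              /* kswapd may still write the page */
}

int main(void)
{
        struct fake_page page = { .dirty = true, .reclaim_flag = false };
        enum page_action action = scan_one_page(&page, false);

        printf("direct reclaim action %d, reclaim flag now %d\n",
               action, page.reclaim_flag);
        return 0;
}

The real code is far more involved; the sketch only shows why skipping
->writepage from direct reclaim trades writeback stalls for extra
scanning, and hence System CPU time.
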
The following are the results with this patch replaced by your series;
dirtybalance-v7r1 is the kernel with your series applied.

                        3.1.0-vanilla           rc5-vanilla         freemore-v6r1          isolate-v6r1     dirtybalance-v7r1
System Time         1.22 (     0.00%)    13.89 ( -1040.72%)    46.40 ( -3709.20%)     4.44 (  -264.37%)    43.05 ( -3434.81%)
+/-                 0.06 (     0.00%)    22.82 (-37635.56%)     3.84 ( -6249.44%)     6.48 (-10618.92%)     4.04 ( -6581.33%)
User Time           0.06 (     0.00%)     0.06 (    -6.90%)     0.05 (    17.24%)     0.05 (    13.79%)     0.05 (    20.69%)
+/-                 0.01 (     0.00%)     0.01 (    33.33%)     0.01 (    33.33%)     0.01 (    39.14%)     0.01 (    -1.84%)
Elapsed Time    10445.54 (     0.00%)  2249.92 (    78.46%)    70.06 (    99.33%)    16.59 (    99.84%)    73.71 (    99.29%)
+/-               643.98 (     0.00%)   811.62 (   -26.03%)    10.02 (    98.44%)     7.03 (    98.91%)    17.90 (    97.22%)
THP Active         15.60 (     0.00%)    35.20 (   225.64%)    65.00 (   416.67%)    70.80 (   453.85%)   102.60 (   657.69%)
+/-                18.48 (     0.00%)    51.29 (   277.59%)    15.99 (    86.52%)    37.91 (   205.18%)    26.06 (   141.02%)
Fault Alloc       121.80 (     0.00%)    76.60 (    62.89%)   155.40 (   127.59%)   181.20 (   148.77%)   214.80 (   176.35%)
+/-                73.51 (     0.00%)    61.11 (    83.12%)    34.89 (    47.46%)    31.88 (    43.36%)    53.21 (    72.39%)
Fault Fallback    881.20 (     0.00%)   926.60 (    -5.15%)   847.60 (     3.81%)   822.00 (     6.72%)   788.40 (    10.53%)
+/-                73.51 (     0.00%)    61.26 (    16.67%)    34.89 (    52.54%)    31.65 (    56.94%)    53.41 (    27.35%)

MMTests Statistics: duration
User/Sys Time Running Test (seconds)    3540.88   1945.37    716.04     64.97    715.04
Total Elapsed Time (seconds)           52417.33  11425.90    501.02    230.95    549.64

Your series does help the System CPU time, bringing it from 46.4 seconds
to 43.05 seconds. That is within the noise but towards the edge of one
standard deviation. With such a small reduction, elapsed time was not
helped. However, it did help THP allocation success rates - still within
the noise, but again at the edge of it, which indicates a solid
improvement.

MMTests Statistics: vmstat
Page Ins                  3257266139  1111844061    17263623    10901575    20870385
Page Outs                   81054922    30364312     3626530     3657687     3665499
Swap Ins                        3294        2851        6560        4964        6598
Swap Outs                     390073      528094      620197      790912      604228
Direct pages scanned      1077581700  3024951463  1764930052   115140570  1796314840
Kswapd pages scanned        34826043     7112868     2131265     1686942     2093637
Kswapd pages reclaimed      28950067     4911036     1246044      966475     1319662
Direct pages reclaimed     805148398   280167837     3623473     2215044     4182274
Kswapd efficiency                83%         69%         58%         57%         63%
Kswapd velocity              664.399     622.521    4253.852    7304.360    3809.106
Direct efficiency                74%          9%          0%          1%          0%
Direct velocity            20557.737  264745.137 3522673.849  498551.938 3268166.145
Percentage direct scans          96%         99%         99%         98%         99%
Page writes by reclaim        722646      529174      620319      791018      604368
Page writes file              332573        1080         122         106         140
Page writes anon              390073      528094      620197      790912      604228
Page reclaim immediate             0  2552514720  1635858848   111281140  1661416934
Page rescued immediate             0           0           0       87848           0
Slabs scanned                  23552       23552        9216        8192        8192
Direct inode steals              231           0           0           0           0
Kswapd inode steals                0           0           0           0           0
Kswapd skipped wait            28076         786           0          61           1
THP fault alloc                  609         383         753         906        1074
THP collapse alloc                12           6           0           0           0
THP splits                       536         211         456         593         561
THP fault fallback              4406        4633        4263        4110        3942
THP collapse fail                120         127           0           0           0
Compaction stalls               1810         728         623         779         869
Compaction success               196          53          60          80          99
Compaction failures             1614         675         563         699         770
Compaction pages moved        193158       53545      243185      333457      409585
Compaction move failure         9952        9396       16424       23676       30668

The direct pages scanned figure with your series is unfortunately still
very high.
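
For anyone reproducing the derived figures, this is my reading of how
they relate to the raw counters (shown for the 3.1.0-vanilla column;
the formulas are assumptions inferred from the report, not lifted from
the MMTests source):

#include <stdio.h>

int main(void)
{
        long long direct_scanned   = 1077581700LL;
        long long direct_reclaimed =  805148398LL;
        long long kswapd_scanned   =   34826043LL;
        long long kswapd_reclaimed =   28950067LL;
        double elapsed_seconds     =   52417.33;   /* Total Elapsed Time */

        /* efficiency: pages reclaimed per page scanned, truncated to a % */
        printf("Kswapd efficiency %lld%%\n",
               100 * kswapd_reclaimed / kswapd_scanned);
        printf("Direct efficiency %lld%%\n",
               100 * direct_reclaimed / direct_scanned);

        /* velocity: pages scanned per second of elapsed time */
        printf("Kswapd velocity   %.3f\n", kswapd_scanned / elapsed_seconds);
        printf("Direct velocity   %.3f\n", direct_scanned / elapsed_seconds);

        /* share of all scanning done by direct reclaim */
        printf("Percentage direct scans %lld%%\n",
               100 * direct_scanned / (direct_scanned + kswapd_scanned));
        return 0;
}

That reproduces the 83%, 74%, 664.399, 20557.737 and 96% figures in
the vanilla column above.
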
Overall, I would say that your series is not a replacement for the last
patch in this series.
--
Mel Gorman
SUSE Labs