linux-kernel - Re: [patch 3/5] mm: try to distribute dirty pages fairly across zones

Message-ID: <20111028203944.GB20607@localhost>
Date:	Sat, 29 Oct 2011 04:39:44 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Johannes Weiner <jweiner@...hat.com>
Cc:	Michal Hocko <mhocko@...e.cz>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mgorman@...e.de>,
	Christoph Hellwig <hch@...radead.org>,
	Dave Chinner <david@...morbit.com>, Jan Kara <jack@...e.cz>,
	Rik van Riel <riel@...hat.com>,
	Minchan Kim <minchan.kim@...il.com>,
	Chris Mason <chris.mason@...cle.com>,
	Theodore Ts'o <tytso@....edu>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	"Li, Shaohua" <shaohua.li@...el.com>,
	"xfs@....sgi.com" <xfs@....sgi.com>,
	"linux-btrfs@...r.kernel.org" <linux-btrfs@...r.kernel.org>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [patch 3/5] mm: try to distribute dirty pages fairly across
 zones

[restore CC list]

> > I'm trying to understand where the performance gain comes from.
> > 
> > I noticed that in all cases, before and after the patchset, nr_vmscan_write is zero.
> > 
> > nr_vmscan_immediate_reclaim is significantly reduced though:
> 
> That's a good thing, it means we burn less CPU time on skipping
> through dirty pages on the LRU.
> 
> Until a certain priority level, the dirty pages encountered on the LRU
> list are marked PageReclaim and put back on the list; this is the
> nr_vmscan_immediate_reclaim number.  Only below that priority do we
> actually ask the FS to write them, which is nr_vmscan_write.

Yes, it is.
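
To make the two counters concrete, here is a toy model of the behaviour
described above. It is only a sketch and not the kernel code: the
function name, the page representation and the cutoff value
WRITE_PRIORITY_CUTOFF are all hypothetical, chosen for illustration.

```python
from collections import Counter

# Hypothetical cutoff: at or above this priority, dirty pages are only
# tagged and rotated; below it, they are handed to the FS for writing.
WRITE_PRIORITY_CUTOFF = 10


def scan_lru(pages, priority, stats):
    """Toy scan of an LRU list; returns the number of clean pages reclaimed.

    pages    -- list of dicts with a boolean 'dirty' key (assumed layout)
    priority -- scan priority; a lower value means more reclaim pressure
    stats    -- Counter that accumulates the two vmstat-like counters
    """
    reclaimed = 0
    for page in pages:
        if not page["dirty"]:
            reclaimed += 1  # clean pages can be reclaimed right away
        elif priority >= WRITE_PRIORITY_CUTOFF:
            page["reclaim"] = True  # stands in for marking PageReclaim
            stats["nr_vmscan_immediate_reclaim"] += 1
        else:
            stats["nr_vmscan_write"] += 1  # reclaim asks the FS to write
    return reclaimed


stats = Counter()
lru = [{"dirty": True}, {"dirty": True}, {"dirty": False}, {"dirty": True}]
print(scan_lru(lru, priority=12, stats=stats), dict(stats))
```

In this model, a scan that never drops below the cutoff only ever bumps
nr_vmscan_immediate_reclaim, which matches nr_vmscan_write staying at
zero in the runs above.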

> I suspect this is where the performance improvement comes from: we
> find clean pages for reclaim much faster.

That explains how it could reduce CPU overhead. However, the dd's are
throttled anyway, so I still don't understand how speeding up dd page
allocations improves the _IO_ performance.

> > $ ./compare.rb -g 1000M -e nr_vmscan_immediate_reclaim thresh*/*-ioless-full-nfs-wq5-next-20111014+ thresh*/*-ioless-full-per-zone-dirty-next-20111014+
> > 3.1.0-rc9-ioless-full-nfs-wq5-next-20111014+  3.1.0-rc9-ioless-full-per-zone-dirty-next-20111014+  
> > ------------------------  ------------------------  
> >                560289.00       -98.5%      8145.00  thresh=1000M/btrfs-100dd-4k-8p-4096M-1000M:10-X
> >                576882.00       -98.4%      9511.00  thresh=1000M/btrfs-10dd-4k-8p-4096M-1000M:10-X
> >                651258.00       -98.8%      7963.00  thresh=1000M/btrfs-1dd-4k-8p-4096M-1000M:10-X
> >               1963294.00       -85.4%    286815.00  thresh=1000M/ext3-100dd-4k-8p-4096M-1000M:10-X
> >               2108028.00       -10.6%   1885114.00  thresh=1000M/ext3-10dd-4k-8p-4096M-1000M:10-X
> >               2499456.00       -99.9%      2061.00  thresh=1000M/ext3-1dd-4k-8p-4096M-1000M:10-X
> >               2534868.00       -78.5%    545815.00  thresh=1000M/ext4-100dd-4k-8p-4096M-1000M:10-X
> >               2921668.00       -76.8%    677177.00  thresh=1000M/ext4-10dd-4k-8p-4096M-1000M:10-X
> >               2841049.00      -100.0%       779.00  thresh=1000M/ext4-1dd-4k-8p-4096M-1000M:10-X
> >               2481823.00       -86.3%    339342.00  thresh=1000M/xfs-100dd-4k-8p-4096M-1000M:10-X
> >               2508629.00       -87.4%    316614.00  thresh=1000M/xfs-10dd-4k-8p-4096M-1000M:10-X
> >               2656628.00      -100.0%       678.00  thresh=1000M/xfs-1dd-4k-8p-4096M-1000M:10-X
> >              24303872.00       -83.2%   4080014.00  TOTAL nr_vmscan_immediate_reclaim
> > 
> > If you'd like to compare any other vmstat items before/after patch,
> > let me know and I'll run the compare script to find them out.
> 
> I will come back to you on this, so tired right now.  But I find your
> scripts interesting ;-) Are those released and available for download
> somewhere?  I suspect every kernel hacker has their own collection of
> scripts to process data like this, maybe we should pull them all
> together and put them into a git tree!

Thank you for the interest :-)

I used to upload my writeback test scripts to kernel.org, but its
file service has not been restored yet, so I am attaching the compare
script here. It is a bit hacky for now; I hope it can be improved over
time to be useful to other projects as well.
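
For readers without the attachment, the middle column of the quoted
table is a plain relative change between the two kernels. The following
is a minimal sketch of that arithmetic only; it is not the attached
compare.rb, and the function name and row labels are made up here. The
numbers are copied from three rows of the table above.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (before, after) values of nr_vmscan_immediate_reclaim from the table
rows = {
    "btrfs-100dd": (560289, 8145),
    "ext3-10dd": (2108028, 1885114),
    "xfs-1dd": (2656628, 678),
}

for name, (before, after) in rows.items():
    change = percent_change(before, after)
    print(f"{before:>14.2f} {change:>+10.1f}% {after:>12.2f}  {name}")
```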

Thanks,
Fengguang

Download attachment "compare.rb" of type "application/x-ruby" (6772 bytes)