Date:	Tue, 19 May 2009 13:09:32 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc:	Christoph Lameter <cl@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>, Elladan <elladan@...imo.com>,
	Nick Piggin <npiggin@...e.de>, Johannes Weiner <hannes@...xchg.org>,
	Peter Zijlstra <peterz@...radead.org>, Rik van Riel <riel@...hat.com>,
	"tytso@....edu" <tytso@....edu>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"minchan.kim@...il.com" <minchan.kim@...il.com>
Subject: Re: [PATCH 2/3] vmscan: make mapped executable pages the first class citizen

On Tue, May 19, 2009 at 12:41:38PM +0800, KOSAKI Motohiro wrote:
> Hi
>
> Thanks for the great work.
>
> > SUMMARY
> > =======
> > The patch decreases the number of major faults from 50 to 3 during 10% cache hot reads.
> >
> >
> > SCENARIO
> > ========
> > The test scenario is to do 100000 pread(size=110 pages, offset=(i*100) pages),
> > where 10% of the pages will be activated:
> >
> >         for i in `seq 0 100 10000000`; do echo $i 110; done > pattern-hot-10
> >         iotrace.rb --load pattern-hot-10 --play /b/sparse
>
> Where can I download iotrace.rb?

In the attachment. It relies on some ruby libraries.

> > and monitor /proc/vmstat during the time. The test box has 2G memory.
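The replay loop that iotrace.rb performs on this pattern can be sketched roughly as follows. This is a minimal illustration, not the actual tool; the 4KB page size and the `replay` helper are assumptions made for the sketch:

```python
import os

PAGE = 4096  # assumed page size (x86)

def replay(fd, pattern):
    # Issue one pread() per trace record, as the
    # "offset_pages size_pages" lines in pattern-hot-10 describe.
    for offset_pages, size_pages in pattern:
        os.pread(fd, size_pages * PAGE, offset_pages * PAGE)

# pattern-hot-10: offsets 0, 100, 200, ... pages, each read 110 pages long.
pattern = [(i, 110) for i in range(0, 10000001, 100)]

# Each read advances 100 pages but covers 110, so 10 of every 100
# distinct pages are read a second time and become activation candidates.
hot_fraction = 10 / 100
print(f"{hot_fraction:.0%} of pages are referenced twice")
```

The 10-page overlap between consecutive reads is what makes the workload "10% cache hot": those doubly referenced pages are the ones eligible for the active list.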
> >
> >
> > ANALYSES
> > ========
> >
> > I carried out two runs on a freshly booted console-mode 2.6.29 with the
> > VM_EXEC patch, and fetched the vmstat numbers at
> >
> > (1) begin:   shortly after the big read IO starts;
> > (2) end:     just before the big read IO stops;
> > (3) restore: the big read IO stops and the zsh working set is restored
> >
> >            nr_mapped  nr_active_file  nr_inactive_file  pgmajfault  pgdeactivate    pgfree
> > begin:          2481            2237              8694         630             0    574299
> > end:             275          231976            233914         633        776271  20933042
> > restore:         370          232154            234524         691        777183  20958453
> >
> > begin:          2434            2237              8493         629             0    574195
> > end:             284          231970            233536         632        771918  20896129
> > restore:         399          232218            234789         690        774526  20957909
> >
> > and another run on 2.6.30-rc4-mm with the VM_EXEC logic disabled:
>
> I don't think that is a proper comparison.
> You need one of the following comparisons, otherwise we inject a lot of
> guesswork into the analysis:
>
> - 2.6.29 with and without the VM_EXEC patch
> - 2.6.30-rc4-mm with and without the VM_EXEC patch

I think it doesn't matter that much when it comes to "relative" numbers.
But anyway I guess you want to try a more typical desktop ;)
Unfortunately Xorg is currently broken on my test box..

> > begin:          2479            2344              9659         210             0    579643
> > end:             284          232010            234142         260        772776  20917184
> > restore:         379          232159            234371         301        774888  20967849
> >
> > The numbers show that
> >
> > - The startup pgmajfault of 2.6.30-rc4-mm is merely 1/3 that of 2.6.29.
> >   I'd attribute that improvement to the mmap readahead improvements :-)
> >
> > - The pgmajfault increment during the file copy is 633-630=3 vs 260-210=50.
> >   That's a huge improvement - which means that with the VM_EXEC protection
> >   logic, active mmap pages are pretty safe even under partially cache hot
> >   streaming IO.
> >
> > - When the active:inactive file lru sizes reach 1:1, their scan rates are
> >   1:20.8 under 10% cache hot IO (computed as Dpgdeactivate:Dpgfree).
> >   That roughly means the active mmap pages get 20.8 times more chances to
> >   be re-referenced and stay in memory.
> >
> > - The absolute nr_mapped drops considerably to 1/9 during the big IO, and
> >   the dropped pages are mostly inactive ones. The patch has almost no
> >   impact in this aspect, which means it won't unnecessarily increase
> >   memory pressure. (In contrast, your 20% mmap protection ratio would keep
> >   them all, and therefore eliminate the extra 41 major faults needed to
> >   restore the working set of zsh etc.)
>
> This surprises me.
> Why doesn't your patch protect mapped pages from streaming IO?

It only protects the *active* mapped pages, as expected.
But yes, the active percentage is much lower than expected :-)

> I'd really like to reproduce this myself; please tell me how.

OK. Firstly:

        for i in `seq 0 100 10000000`; do echo $i 110; done > pattern-hot-10
        dd if=/dev/zero of=/tmp/sparse bs=1M count=1 seek=1024000

Then boot into the desktop and run concurrently:

        iotrace.rb --load pattern-hot-10 --play /tmp/sparse
        vmmon nr_mapped nr_active_file nr_inactive_file pgmajfault pgdeactivate pgfree

Note that I was creating the sparse file on btrfs, which happens to be very
slow at sparse file reading:

        151.194384MB/s 284.198252s 100001x 450560b --load pattern-hot-10 --play /b/sparse

In that case, the inactive list is rotated at a speed of 250MB/s, so a full
scan of it takes about 3.5 seconds, while a full scan of the active file
list takes about 77 seconds.

Attached is the source code for both of the above tools.

Thanks,
Fengguang

Download attachment "iotrace.rb" of type "application/x-ruby" (8575 bytes)
View attachment "vmmon.c" of type "text/x-csrc" (2411 bytes)
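The quoted full-scan times can be sanity-checked with back-of-the-envelope arithmetic from the vmstat numbers in the message. This is an illustrative sketch: the 4KB page size is an assumption, while the list sizes, the 250MB/s rotation speed, and the 1:20.8 scan-rate ratio are taken from the message itself:

```python
PAGE = 4096                # bytes, assumed page size
MB = 2 ** 20

nr_inactive_file = 233914  # pages, "end" row of the first 2.6.29 run
nr_active_file = 231976    # pages, same row
inactive_rate = 250 * MB   # bytes/s, the quoted rotation speed
scan_ratio = 20.8          # active:inactive scan rates are 1:20.8

# Time for one full pass over each list at the respective scan rate.
t_inactive = nr_inactive_file * PAGE / inactive_rate
t_active = nr_active_file * PAGE / (inactive_rate / scan_ratio)

print(f"inactive list full scan: ~{t_inactive:.1f}s")
print(f"active list full scan:   ~{t_active:.0f}s")
```

This yields roughly 3.7s for the inactive list and roughly 75s for the active list, consistent with the "about 3.5 seconds" and "about 77 seconds" figures above given the approximate 250MB/s rate.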