Message-ID: <20190219023701.GA3223@xz-x1>
Date: Tue, 19 Feb 2019 10:37:01 +0800
From: Peter Xu <peterx@...hat.com>
To: Andrea Arcangeli <aarcange@...hat.com>
Cc: Jerome Glisse <jglisse@...hat.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <mawilcox@...rosoft.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Radim Krčmář <rkrcmar@...hat.com>,
Michal Hocko <mhocko@...nel.org>, kvm@...r.kernel.org
Subject: Re: [RFC PATCH 0/4] Restore change_pte optimization to its former
glory
On Mon, Feb 18, 2019 at 12:45:05PM -0500, Andrea Arcangeli wrote:
> On Mon, Feb 18, 2019 at 11:04:13AM -0500, Jerome Glisse wrote:
> > So I ran two identical VMs side by side (copies of the same COW
> > image) and built the same kernel tree inside each (that is the only
> > important workload that exists ;)), but change_pte did not have any
> > measurable impact:
> >
> > before mean {real: 1358.250977, user: 16650.880859, sys: 839.199524, npages: 76855.390625}
> > before stdev {real: 6.744010, user: 108.863762, sys: 6.840437, npages: 1868.071899}
> > after mean {real: 1357.833740, user: 16685.849609, sys: 839.646973, npages: 76210.601562}
> > after stdev {real: 5.124797, user: 78.469360, sys: 7.009164, npages: 2468.017578}
> > without mean {real: 1358.501343, user: 16674.478516, sys: 837.791992, npages: 76225.203125}
> > without stdev {real: 5.541104, user: 97.998367, sys: 6.715869, npages: 1682.392578}
> >
> > Above is the time taken by make inside each VM for an allyesconfig
> > build. npages is the number of shared pages reported on the host at
> > the end of the build.
>
> Did you set /sys/kernel/mm/ksm/sleep_millisecs to 0?
>
> It would also help to remove the checksum check from mm/ksm.c:
>
> - if (rmap_item->oldchecksum != checksum) {
> - rmap_item->oldchecksum = checksum;
> - return;
> - }
>
> One way or another, /sys/kernel/mm/ksm/pages_shared and/or
> pages_sharing need to change significantly to be sure we're exercising
> the COW/merging code that uses change_pte. KSM is smart enough to
> merge only pages that aren't changing frequently, and with the default
> KSM code this heuristic probably works too well for a kernel build.
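
For whoever wants to reproduce this, a trivial host-side helper to
watch those two counters during the build could look like the below
(untested sketch; the sysfs paths are the stock KSM ones, the helper
itself is only illustrative):

	#include <stdio.h>

	/* Read one counter from /sys/kernel/mm/ksm/<name>; -1 on error. */
	static long read_ksm_counter(const char *name)
	{
		char path[256];
		FILE *f;
		long val = -1;

		snprintf(path, sizeof(path), "/sys/kernel/mm/ksm/%s", name);
		f = fopen(path, "r");
		if (!f)
			return -1;
		if (fscanf(f, "%ld", &val) != 1)
			val = -1;
		fclose(f);
		return val;
	}

	int main(void)
	{
		printf("pages_shared=%ld pages_sharing=%ld\n",
		       read_ksm_counter("pages_shared"),
		       read_ksm_counter("pages_sharing"));
		return 0;
	}
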
Would it also make sense to track how many pages are really affected
by change_pte (say, in kvm_set_pte_rmapp, count the SPTEs that are
successfully rebuilt)?  I'm thinking that even if many pages are
merged by KSM, it's still possible that those pages are not actively
shadowed by the KVM MMU, while change_pte only affects actively
shadowed SPTEs.  In other words, IMHO we may not observe any obvious
performance difference if the pages being accessed are not the ones
merged by KSM.  In our case (building the kernel), IIUC the pages
most likely to be shared are system image pages, but I wonder whether
those pages are accessed frequently during the build, and whether
that could explain the similar performance numbers.
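
Something like the following is what I have in mind (completely
untested sketch: "change_pte_rebuilt" is a made-up debug counter, not
an existing KVM stat, and the surrounding rmap walk is elided):

	#include <linux/atomic.h>
	#include <linux/printk.h>

	/* How many SPTEs the change_pte notifier rewrote in place. */
	static atomic64_t change_pte_rebuilt = ATOMIC64_INIT(0);

	/*
	 * Call this from kvm_set_pte_rmapp() on the branch that rebuilds
	 * an SPTE with the new pfn instead of zapping it.
	 */
	static void count_change_pte_rebuild(void)
	{
		atomic64_inc(&change_pte_rebuilt);
	}

	/* Dump the total wherever convenient (VM destroy, debugfs, ...). */
	static void dump_change_pte_rebuilds(void)
	{
		pr_info("change_pte rebuilt %lld sptes\n",
			(long long)atomic64_read(&change_pte_rebuilt));
	}

If the counter stays near zero while pages_sharing grows, that would
explain why the numbers look identical with and without the patch.
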
Thanks,
--
Peter Xu