lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190218160411.GA3142@redhat.com>
Date:   Mon, 18 Feb 2019 11:04:13 -0500
From:   Jerome Glisse <jglisse@...hat.com>
To:     Andrea Arcangeli <aarcange@...hat.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Peter Xu <peterx@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Matthew Wilcox <mawilcox@...rosoft.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Michal Hocko <mhocko@...nel.org>, kvm@...r.kernel.org
Subject: Re: [RFC PATCH 0/4] Restore change_pte optimization to its former
 glory

On Mon, Feb 11, 2019 at 03:02:00PM -0500, Andrea Arcangeli wrote:
> On Mon, Feb 11, 2019 at 02:09:31PM -0500, Jerome Glisse wrote:
> > Yeah, between do you have any good workload for me to test this ? I
> > was thinking of running few same VM and having KSM work on them. Is
> > there some way to trigger KVM to fork ? As the other case is breaking
> > COW after fork.
> 
> KVM can fork on guest pci-hotplug events or network init to run host
> scripts and re-init the signals before doing the exec, but it won't
> move the needle because all guest memory registered in the MMU
> notifier is set as MADV_DONTFORK... so fork() is a noop unless qemu is
> also modified not to call MADV_DONTFORK.
> 
> Calling if (!fork()) exit(0) from a timer at regular intervals during
> qemu runtime after turning off MADV_DONTFORK in qemu would allow to
> exercise fork against the KVM MMU Notifier methods.
> 
> The optimized change_pte code in copy-on-write code is the same
> post-fork or post-KSM merge and fork() itself doesn't use change_pte
> while KSM does, so with regard to change_pte it should already provide
> a good test coverage to test with only KSM without fork(). It'll cover
> the read-write -> readonly transition with same PFN
> (write_protect_page), the read-only to read-only changing PFN
> (replace_page) as well as the readonly -> read-write transition
> changing PFN (wp_page_copy) all three optimized with change_pte. Fork
> would not leverage change_pte for the first two cases.

So i run 2 exact same VMs side by side (copy of same COW image) and
built the same kernel tree inside each (that is the only important
workload that exist ;)) but the change_pte did not have any impact:

before  mean  {real: 1358.250977, user: 16650.880859, sys: 839.199524, npages: 76855.390625}
before  stdev {real:    6.744010, user:   108.863762, sys:   6.840437, npages:  1868.071899}
after   mean  {real: 1357.833740, user: 16685.849609, sys: 839.646973, npages: 76210.601562}
after   stdev {real:    5.124797, user:    78.469360, sys:   7.009164, npages:  2468.017578}
without mean  {real: 1358.501343, user: 16674.478516, sys: 837.791992, npages: 76225.203125}
without stdev {real:    5.541104, user:    97.998367, sys:   6.715869, npages:  1682.392578}

Above is time taken by make inside each VM for all yes config. npages
is the number of page shared reported on the host at the end of the
build.

There is no change before and after the patchset to restore change
pte. I tried removing the change_pte callback alltogether to see if
that did have any effect (without above) and it did not have any
effect either.

Should we still restore change_pte() ? It does not hurt, but it does
not seems to help in anyway. Maybe you have a better benchmark i could
run ?

Cheers,
Jérôme

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ