Message-ID: <20190204181118.12095.38300.stgit@localhost.localdomain>
Date: Mon, 04 Feb 2019 10:15:33 -0800
From: Alexander Duyck <alexander.duyck@...il.com>
To: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org
Cc: rkrcmar@...hat.com, alexander.h.duyck@...ux.intel.com,
x86@...nel.org, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
pbonzini@...hat.com, tglx@...utronix.de, akpm@...ux-foundation.org
Subject: [RFC PATCH 0/4] kvm: Report unused guest pages to host
This patch set provides a mechanism by which guests can notify the host of
pages that are not currently in use. Using this data a KVM host can more
easily balance memory workloads between guests and improve overall system
performance by avoiding unnecessary writing of unused pages to swap.
In order to support this I have added a new hypercall to provide unused
page hints and made use of mechanisms currently used by the PowerPC and
s390 architectures to provide those hints. To reduce the overhead of this
call I am only using it per huge page instead of doing a notification per
4K page. By doing this we can avoid the expense of fragmenting higher
order pages, and reduce the overall cost of the hypercall as it will only
be performed once per huge page.
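
As a rough illustration of the per-huge-page filtering described above, here
is a minimal user-space sketch. It is not the code from this series: the
names (sketch_free_hint, sketch_hint_hypercall, sketch_calls_needed) are
hypothetical, the hypercall is replaced by a counter, and it assumes 4K base
pages with 2M huge pages (order 9), as on x86-64.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB base pages and 2 MiB huge pages (order 9), as on x86-64. */
#define SKETCH_PAGE_SHIFT      12
#define SKETCH_HUGE_PAGE_ORDER 9

/* Hypothetical stand-in for the hint hypercall; it only counts invocations. */
static unsigned long sketch_hint_calls;

static void sketch_hint_hypercall(unsigned long pfn, unsigned int order)
{
	(void)pfn;
	(void)order;
	sketch_hint_calls++;
}

/*
 * Called when a block of 2^order pages starting at pfn is freed.
 * Returns true if a hint was issued. Blocks below huge page order are
 * skipped so that the host is never asked to split a huge page.
 */
static bool sketch_free_hint(unsigned long pfn, unsigned int order)
{
	/* Buddy blocks are naturally aligned to their own size. */
	assert((pfn & ((1UL << order) - 1)) == 0);

	if (order < SKETCH_HUGE_PAGE_ORDER)
		return false;

	sketch_hint_hypercall(pfn, order);
	return true;
}

/* Number of hint calls needed to cover 'bytes' using blocks of 'order'. */
static unsigned long sketch_calls_needed(unsigned long long bytes,
					 unsigned int order)
{
	unsigned long long block = 1ULL << (SKETCH_PAGE_SHIFT + order);

	return (unsigned long)((bytes + block - 1) / block);
}
```

Under these assumptions, covering 8G of freed memory takes 4096 calls at
huge page granularity versus 2097152 calls at 4K granularity.
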
Because we are limiting this to huge pages it was necessary to add a
secondary location where we make the call as the buddy allocator can merge
smaller pages into a higher order huge page.
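
To show why a second call site is needed, the following toy model of buddy
merging may help. It is only a sketch under stated assumptions and not the
notifier added in this series: the 16-page arena, the TOY_HINT_ORDER
threshold of 2, and all toy_* names are made up for illustration. Pages freed
one at a time never reach the threshold on their own, so the hint has to be
raised at the point where merging produces a large enough block.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_PAGES   16
#define TOY_MAX_ORDER  4	/* largest block covers all 16 pages */
#define TOY_HINT_ORDER 2	/* hypothetical "huge page" order for this toy */

/* toy_free_order[pfn] is the order of the free block starting at pfn, or -1. */
static int toy_free_order[TOY_NR_PAGES];
static unsigned int toy_merge_hints;

static void toy_init(void)
{
	for (unsigned int i = 0; i < TOY_NR_PAGES; i++)
		toy_free_order[i] = -1;
	toy_merge_hints = 0;
}

/* Free a block and merge it with its buddy for as long as the buddy is free. */
static void toy_free(unsigned int pfn, unsigned int order)
{
	unsigned int start_order = order;

	assert(pfn < TOY_NR_PAGES && (pfn & ((1u << order) - 1)) == 0);

	while (order < TOY_MAX_ORDER) {
		unsigned int buddy = pfn ^ (1u << order);

		if (buddy >= TOY_NR_PAGES || toy_free_order[buddy] != (int)order)
			break;

		toy_free_order[buddy] = -1;	/* buddy is absorbed */
		pfn &= ~(1u << order);		/* merged block starts at the lower pfn */
		order++;
	}

	toy_free_order[pfn] = (int)order;

	/*
	 * Stand-in for the merge notification: the block was freed below the
	 * hint threshold but merging has pushed it to or above the threshold.
	 */
	if (start_order < TOY_HINT_ORDER && order >= TOY_HINT_ORDER)
		toy_merge_hints++;
}
```

Freeing pages 0 through 3 individually produces no hint for the first three
frees; the fourth free merges everything into one order-2 block and raises
a single hint.
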
This approach is not usable in all cases. Specifically, when KVM direct
device assignment is used, the memory for a guest is permanently assigned
to physical pages in order to support DMA from the assigned device. In
this case we cannot give the pages back, so the hypercall is disabled by
the host.
Another situation that can lead to issues is when a page is accessed
immediately after it is freed. For example, if page poisoning is enabled
the guest will populate the page *after* freeing it. In this case it does
not make sense to provide a hint about the page being freed, so we do not
perform the hypercalls from the guest if this functionality is enabled.
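
The two restrictions above amount to a pair of simple checks, one on each
side. The sketch below only restates that logic; the function names and the
boolean parameters are hypothetical and do not correspond to identifiers in
the patches.

```c
#include <stdbool.h>

/*
 * Hypothetical host-side check: with direct device assignment the guest
 * memory stays pinned for DMA, so the host does not offer the hypercall.
 */
static bool sketch_host_offers_hints(bool has_assigned_device)
{
	return !has_assigned_device;
}

/*
 * Hypothetical guest-side check: with page poisoning the guest writes to
 * the page after freeing it, so a hint would be pointless.
 */
static bool sketch_guest_uses_hints(bool host_offers_hints,
				    bool page_poisoning_enabled)
{
	return host_offers_hints && !page_poisoning_enabled;
}
```
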
My testing up till now has consisted of setting up four 8GB VMs on a system
with 32GB of memory and 4GB of swap. To stress the memory on the system I
would run "memhog 8G" sequentially on each of the guests and observe how
long it took to complete the run. The observed behavior is that on a
system with these patches applied in both the guest and the host I was
able to complete the test with a time of 5 to 7 seconds per guest. On a
system without these patches the time ranged from 7 to 49 seconds per
guest. I am assuming the variability is due to time being spent writing
pages out to disk in order to free up space for the guest.
---
Alexander Duyck (4):
      madvise: Expose ability to set dontneed from kernel
      kvm: Add host side support for free memory hints
      kvm: Add guest side support for free memory hints
      mm: Add merge page notifier

 Documentation/virtual/kvm/cpuid.txt      |  4 ++
 Documentation/virtual/kvm/hypercalls.txt | 14 ++++++++
 arch/x86/include/asm/page.h              | 25 +++++++++++++++
 arch/x86/include/uapi/asm/kvm_para.h     |  3 ++
 arch/x86/kernel/kvm.c                    | 51 ++++++++++++++++++++++++++++++
 arch/x86/kvm/cpuid.c                     |  6 +++-
 arch/x86/kvm/x86.c                       | 35 +++++++++++++++++++++
 include/linux/gfp.h                      |  4 ++
 include/linux/mm.h                       |  2 +
 include/uapi/linux/kvm_para.h            |  1 +
 mm/madvise.c                             | 13 +++++++-
 mm/page_alloc.c                          |  2 +
 12 files changed, 158 insertions(+), 2 deletions(-)