Message-ID: <9cddf98d-e2ce-0f8a-d46c-e15a54bc7391@redhat.com>
Date: Fri, 2 Aug 2019 10:41:24 -0400
From: Nitesh Narayan Lal <nitesh@...hat.com>
To: Alexander Duyck <alexander.duyck@...il.com>, kvm@...r.kernel.org,
david@...hat.com, mst@...hat.com, dave.hansen@...el.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
akpm@...ux-foundation.org
Cc: yang.zhang.wz@...il.com, pagupta@...hat.com, riel@...riel.com,
konrad.wilk@...cle.com, willy@...radead.org,
lcapitulino@...hat.com, wei.w.wang@...el.com, aarcange@...hat.com,
pbonzini@...hat.com, dan.j.williams@...el.com,
alexander.h.duyck@...ux.intel.com
Subject: Re: [PATCH v3 0/6] mm / virtio: Provide support for unused page
reporting
On 8/1/19 6:24 PM, Alexander Duyck wrote:
> This series provides an asynchronous means of reporting to a hypervisor
> that a guest page is no longer in use and can have the data associated
> with it dropped. To do this I have implemented functionality that allows
> for what I am referring to as unused page reporting.
>
> The functionality for this is fairly simple. When enabled it will allocate
> statistics to track the number of reported pages in a given free area.
> When the number of free pages exceeds this value plus a high water value,
> currently 32, it will begin performing page reporting which consists of
> pulling pages off of the free list and placing them into a scatterlist. The
> scatterlist is then given to the page reporting device and it will perform
> the required action to make the pages "reported". In the case of
> virtio-balloon this results in the pages being madvised as MADV_DONTNEED
> and as such they are forced out of the guest. After this they are placed
> back on the free list, and an additional bit is added if they are not
> merged indicating that they are a reported buddy page instead of a
> standard buddy page. The cycle then repeats with additional non-reported
> pages being pulled until the free areas all consist of reported pages.
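
For anyone following along, the cycle described above can be sketched as a
small user-space model. This is only an illustration under stated assumptions,
not code from the series: struct free_area_sim, needs_reporting(),
run_reporting_cycle(), count_pages() and BATCH_MAX are hypothetical names, and
only the high water value of 32 is taken from the cover letter.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; none are taken from the patch set. */
#define HIGH_WATER 32UL  /* high water value quoted in the cover letter */
#define BATCH_MAX  16UL  /* assumed capacity of one scatterlist */

struct free_area_sim {
	unsigned long nr_free;     /* all free pages in this area */
	unsigned long nr_reported; /* subset already reported to the host */
};

/* Stand-in for the page reporting device, e.g. virtio-balloon. */
typedef void (*report_fn)(unsigned long nr_pages, void *ctx);

/* Example device callback: just accumulates the number of pages seen. */
static void count_pages(unsigned long nr_pages, void *ctx)
{
	*(unsigned long *)ctx += nr_pages;
}

/* Reporting starts once free pages exceed reported pages plus high water. */
static int needs_reporting(const struct free_area_sim *a)
{
	return a->nr_free > a->nr_reported + HIGH_WATER;
}

/* Runs one full cycle and returns how many batches went to the device. */
static unsigned long run_reporting_cycle(struct free_area_sim *a,
					 report_fn report, void *ctx)
{
	unsigned long batches = 0;

	assert(a->nr_reported <= a->nr_free);
	if (!needs_reporting(a))
		return 0;

	/* Repeat until the area consists only of reported pages. */
	while (a->nr_reported < a->nr_free) {
		unsigned long left = a->nr_free - a->nr_reported;
		unsigned long batch = left < BATCH_MAX ? left : BATCH_MAX;

		a->nr_free -= batch;     /* pull pages off the free list */
		report(batch, ctx);      /* hand the batch to the device */
		a->nr_free += batch;     /* place them back on the free list */
		a->nr_reported += batch; /* now flagged as reported */
		batches++;
	}
	return batches;
}
```

With 100 unreported free pages this model hands the device seven batches; with
exactly 32 it does nothing, since the count has to exceed the high water value
before reporting begins.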
>
> I am leaving a number of things hard-coded such as limiting the lowest
> order processed to PAGEBLOCK_ORDER, and have left it up to the guest to
> determine what the limit is on how many pages it wants to allocate to
> process the hints. The upper limit for this is based on the size of the
> queue used to store the scatterlist.
>
> My primary testing has just been to verify the memory is being freed after
> allocation by running memhog 40g on a 40g guest and watching the total
> free memory via /proc/meminfo on the host. With this I have verified most
> of the memory is freed after each iteration. For performance, I have
> been mainly focusing on the will-it-scale/page_fault1 test running with
> 16 vcpus. With that I have seen up to a 2% difference between the base
> kernel without these patches and the patches with virtio-balloon enabled
> or disabled.
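
For reproducing the host-side check, something like the following could be used
to sample free memory between memhog iterations. It is a minimal sketch, not
the tooling used for the numbers above: parse_memfree_kb() and
read_memfree_kb() are hypothetical helpers, and it assumes the usual
"MemFree: <n> kB" line format of /proc/meminfo.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns the MemFree value in kB from /proc/meminfo-style text,
 * or -1 if no MemFree line is present.
 */
static long parse_memfree_kb(const char *text)
{
	const char *line = text;

	assert(text != NULL);
	while (line && *line) {
		long kb;

		/* Whitespace in the format matches any run of blanks. */
		if (sscanf(line, "MemFree: %ld kB", &kb) == 1)
			return kb;
		line = strchr(line, '\n');
		if (line)
			line++; /* step past the newline to the next line */
	}
	return -1;
}

/* Reads the file at path line by line; returns -1 on any failure. */
static long read_memfree_kb(const char *path)
{
	char line[256];
	long kb = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		kb = parse_memfree_kb(line);
		if (kb >= 0)
			break;
	}
	fclose(f);
	return kb;
}
```

Calling read_memfree_kb("/proc/meminfo") on the host before and after each
memhog run gives the two values to compare.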
A couple of questions:
- Is the 2% difference you mention visible across all 16 cores, or only
at the 16th core?
- I am assuming the difference is seen for both the "number of processes"
and "number of threads" results reported by page_fault1. Is that right?
>
> One side effect of these patches is that the guest becomes much more
> resilient in terms of NUMA locality. With the pages being freed and then
> reallocated when used it allows for the pages to be much closer to the
> active thread, and as a result there can be situations where this patch
> set will out-perform the stock kernel when the guest memory is not local
> to the guest vCPUs.
Was this the reason you were seeing better results for page_fault1
earlier?
>
> Patch 4 is a bit on the large side at about 600 lines of change, however
> I really didn't see a good way to break it up since each piece feeds into
> the next. So I couldn't add the statistics by themselves as it didn't
> really make sense to add them without something that will either read or
> increment/decrement them, or add the Hinted state without something that
> would set/unset it. As such I just ended up adding the entire thing as
> one patch. It makes it a bit bigger but avoids the issues in the previous
> set where I was referencing things that had not yet been added.
>
> Changes from the RFC:
> https://lore.kernel.org/lkml/20190530215223.13974.22445.stgit@localhost.localdomain/
> Moved aeration requested flag out of aerator and into zone->flags.
> Moved boundary out of free_area and into local variables for aeration.
> Moved aeration cycle out of interrupt and into workqueue.
> Left nr_free as total pages instead of splitting it between raw and aerated.
> Combined size and physical address values in virtio ring into one 64b value.
>
> Changes from v1:
> https://lore.kernel.org/lkml/20190619222922.1231.27432.stgit@localhost.localdomain/
> Dropped "waste page treatment" in favor of "page hinting"
> Renamed files and functions from "aeration" to "page_hinting"
> Moved from page->lru list to scatterlist
> Replaced wait on refcnt in shutdown with RCU and cancel_delayed_work_sync
> Virtio now uses scatterlist directly instead of intermediate array
> Moved stats out of free_area, now in separate area and pointed to from zone
> Merged patch 5 into patch 4 to improve review-ability
> Updated various code comments throughout
>
> Changes from v2:
> https://lore.kernel.org/lkml/20190724165158.6685.87228.stgit@localhost.localdomain/
> Dropped "page hinting" in favor of "page reporting"
> Renamed files from "hinting" to "reporting"
> Replaced "Hinted" page type with "Reported" page flag
> Added support for page poisoning while hinting is active
> Added QEMU patch that implements PAGE_POISON feature
>
> ---
>
> Alexander Duyck (6):
> mm: Adjust shuffle code to allow for future coalescing
> mm: Move set/get_pcppage_migratetype to mmzone.h
> mm: Use zone and order instead of free area in free_list manipulators
> mm: Introduce Reported pages
> virtio-balloon: Pull page poisoning config out of free page hinting
> virtio-balloon: Add support for providing unused page reports to host
>
>
> drivers/virtio/Kconfig | 1
> drivers/virtio/virtio_balloon.c | 75 ++++++++-
> include/linux/mmzone.h | 116 ++++++++------
> include/linux/page-flags.h | 11 +
> include/linux/page_reporting.h | 138 ++++++++++++++++
> include/uapi/linux/virtio_balloon.h | 1
> mm/Kconfig | 5 +
> mm/Makefile | 1
> mm/internal.h | 18 ++
> mm/memory_hotplug.c | 1
> mm/page_alloc.c | 238 ++++++++++++++++++++--------
> mm/page_reporting.c | 299 +++++++++++++++++++++++++++++++++++
> mm/shuffle.c | 24 ---
> mm/shuffle.h | 32 ++++
> 14 files changed, 821 insertions(+), 139 deletions(-)
> create mode 100644 include/linux/page_reporting.h
> create mode 100644 mm/page_reporting.c
>
> --
>
--
Thanks
Nitesh