[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200228134510.GA50843@cmpxchg.org>
Date: Fri, 28 Feb 2020 08:45:10 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Michal Hocko <mhocko@...nel.org>
Cc: "Huang, Ying" <ying.huang@...el.com>,
David Hildenbrand <david@...hat.com>,
Matthew Wilcox <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Mel Gorman <mgorman@...e.de>,
Vlastimil Babka <vbabka@...e.cz>, Zi Yan <ziy@...dia.com>,
Peter Zijlstra <peterz@...radead.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Minchan Kim <minchan@...nel.org>,
Hugh Dickins <hughd@...gle.com>,
Alexander Duyck <alexander.duyck@...il.com>
Subject: Re: [RFC 0/3] mm: Discard lazily freed pages when migrating
On Fri, Feb 28, 2020 at 10:50:48AM +0100, Michal Hocko wrote:
> On Fri 28-02-20 16:55:40, Huang, Ying wrote:
> > David Hildenbrand <david@...hat.com> writes:
> [...]
> > > E.g., free page reporting in QEMU wants to use MADV_FREE. The guest will
> > > report currently free pages to the hypervisor, which will MADV_FREE the
> > > reported memory. As long as there is no memory pressure, there is no
> > > need to actually free the pages. Once the guest reuses such a page, it
> > > could happen that there is still the old page and pulling in in a fresh
> > > (zeroed) page can be avoided.
> > >
> > > AFAIKs, after your change, we would get more pages discarded from our
> > > guest, resulting in more fresh (zeroed) pages having to be pulled in
> > > when a guest touches a reported free page again. But OTOH, page
> > > migration is speed up (avoiding to migrate these pages).
> >
> > Let's look at this problem in another perspective. To migrate the
> > MADV_FREE pages of the QEMU process from the node A to the node B, we
> > need to free the original pages in the node A, and (maybe) allocate the
> > same number of pages in the node B. So the question becomes
> >
> > - we may need to allocate some pages in the node B
> > - these pages may be accessed by the application or not
> > - we should allocate all these pages in advance or allocate them lazily
> > when they are accessed.
> >
> > We thought the common philosophy in Linux kernel is to allocate lazily.
>
> The common philosophy is to cache as much as possible. And MADV_FREE
> pages are a kind of cache as well. If the target node is short on memory
> then those will be reclaimed as a cache so a pro-active freeing sounds
> counter productive as you do not have any idea whether that cache is
> going to be used in future. In other words you are not going to free a
> clean page cache if you want to use that memory as a migration target
> right? So you should make a clear case about why MADV_FREE cache is less
> important than the clean page cache and ideally have a good
> justification backed by real workloads.
Agreed.
MADV_FREE says that the *data* in the pages is no longer needed, and
so the pages are cheaper to reclaim than regular anon pages (swap).
But people use MADV_FREE in the hope that they can reuse the pages at
a later point - otherwise, they'd unmap them. We should retain them
until the memory is actively needed for other allocations.
Powered by blists - more mailing lists