[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <198b3382-e511-2f7d-e738-5aedcb1a09e6@redhat.com>
Date: Fri, 8 Mar 2019 12:53:26 +0100
From: David Hildenbrand <david@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Alexander Duyck <alexander.duyck@...il.com>,
Nitesh Narayan Lal <nitesh@...hat.com>,
kvm list <kvm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>,
Paolo Bonzini <pbonzini@...hat.com>, lcapitulino@...hat.com,
pagupta@...hat.com, wei.w.wang@...el.com,
Yang Zhang <yang.zhang.wz@...il.com>,
Rik van Riel <riel@...riel.com>, dodgen@...gle.com,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
dhildenb@...hat.com, Andrea Arcangeli <aarcange@...hat.com>
Subject: Re: [RFC][Patch v9 0/6] KVM: Guest Free Page Hinting
On 08.03.19 03:24, Michael S. Tsirkin wrote:
> On Thu, Mar 07, 2019 at 08:27:32PM +0100, David Hildenbrand wrote:
>> On 07.03.19 19:53, Michael S. Tsirkin wrote:
>>> On Thu, Mar 07, 2019 at 10:45:58AM -0800, Alexander Duyck wrote:
>>>> To that end what I think w may want to do is instead just walk the LRU
>>>> list for a given zone/order in reverse order so that we can try to
>>>> identify the pages that are most likely to be cold and unused and
>>>> those are the first ones we want to be hinting on rather than the ones
>>>> that were just freed. If we can look at doing something like adding a
>>>> jiffies value to the page indicating when it was last freed we could
>>>> even have a good point for determining when we should stop processing
>>>> pages in a given zone/order list.
>>>>
>>>> In reality the approach wouldn't be too different from what you are
>>>> doing now, the only real difference would be that we would just want
>>>> to walk the LRU list for the given zone/order rather then pulling
>>>> hints on what to free from the calls to free_one_page. In addition we
>>>> would need to add a couple bits to indicate if the page has been
>>>> hinted on, is in the middle of getting hinted on, and something such
>>>> as the jiffies value I mentioned which we could use to determine how
>>>> old the page is.
>>>
>>> Do we really need bits in the page?
>>> Would it be bad to just have a separate hint list?
>>>
>>> If you run out of free memory you can check the hint
>>> list, if you find stuff there you can spin
>>> or kick the hypervisor to hurry up.
>>>
>>> Core mm/ changes, so nothing's easy, I know.
>>
>> We evaluated the idea of busy spinning on some bit/list entry a while
>> ago. While it sounds interesting, it is usually not what we want and has
>> other negative performance impacts.
>>
>> Talking about "marking" pages, what we actually would want is to rework
>> the buddy to skip over these "marked" pages and only really spin in case
>> there are no other pages left. Allocation paths should only ever be
>> blocked if OOM, not if just some hinting activity is going on on another
>> VCPU.
>>
>> However as you correctly say: "core mm changes". New page flag?
>> Basically impossible.
>
> Well not exactly. page bits are at a premium but only for
> *allocated* pages. pages in the buddy are free and there are
> some unused bits for these.
>
As I said, we have to be very careful here.
Most parts of struct page can me modified by *the owner* of the page. In
case the page is online but not allocated, buddy is the owner. Not some
kvm/virtio thingy that hooks into some callback.
Manipulating random page bits of buddy pages in *some* kernel module I
consider problematic and will most probably not be accepted upstream.
What could work is, factoring out these parts e.g. into
mm/page_hinting.c, then it gets part of the core mm in some way. Which
would actually be a nice thing to do either way we go.
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists