linux-kernel - Re: [PATCH v11 4/6] mm: function to offer a page block on the free list

Message-ID: <20170620212107-mutt-send-email-mst@kernel.org>
Date:   Tue, 20 Jun 2017 21:26:15 +0300
From:   "Michael S. Tsirkin" <mst@...hat.com>
To:     Rik van Riel <riel@...hat.com>
Cc:     David Hildenbrand <david@...hat.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Wei Wang <wei.w.wang@...el.com>, linux-kernel@...r.kernel.org,
        qemu-devel@...gnu.org, virtualization@...ts.linux-foundation.org,
        kvm@...r.kernel.org, linux-mm@...ck.org, cornelia.huck@...ibm.com,
        akpm@...ux-foundation.org, mgorman@...hsingularity.net,
        aarcange@...hat.com, amit.shah@...hat.com, pbonzini@...hat.com,
        liliang.opensource@...il.com, Nitesh Narayan Lal <nilal@...hat.com>
Subject: Re: [PATCH v11 4/6] mm: function to offer a page block on the free
 list

On Tue, Jun 20, 2017 at 01:29:00PM -0400, Rik van Riel wrote:
> On Tue, 2017-06-20 at 18:49 +0200, David Hildenbrand wrote:
> > On 20.06.2017 18:44, Rik van Riel wrote:
> 
> > > Nitesh Lal (on the CC list) is working on a way
> > > to efficiently batch recently freed pages for
> > > free page hinting to the hypervisor.
> > > 
> > > If that is done efficiently enough (eg. with
> > > MADV_FREE on the hypervisor side for lazy freeing,
> > > and lazy later re-use of the pages), do we still
> > > need the harder to use batch interface from this
> > > patch?
> > > 
> > 
> > David's opinion incoming:
> > 
> > No, I think proper free page hinting would be the optimum solution,
> > if done right. This would avoid the batch interface and even render
> > virtio-balloon useless in some sense.
> 
> I agree with that.  Let me go into some more detail of
> what Nitesh is implementing:
> 
> 1) In arch_free_page, the being-freed page is added
>    to a per-cpu set of freed pages.
> 2) Once that set is full, arch_free_page goes into a
>    slow path, which:
>    2a) Iterates over the set of freed pages, and
>    2b) Checks whether they are still free, and
>    2c) Adds the still free pages to a list that is
>        to be passed to the hypervisor, to be MADV_FREEd.
>    2d) Makes that hypercall.
> 
> Meanwhile all arch_alloc_page has to do is make sure it
> does not allocate a page while it is currently being
> MADV_FREEd on the hypervisor side.
> 
> The code Wei is working on looks like it could be 
> suitable for steps (2c) and (2d) above. Nitesh already
> has code for steps 1 through 2b.
> 
> -- 
> All rights reversed
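
For readers following the thread, steps 1 through 2d quoted above can be
modelled in plain userspace C. This is only a rough sketch, not Nitesh's
code: toy_free_page, toy_alloc_page, flush_batch, hint_hypercall,
BATCH_SIZE and the page_is_free array are all hypothetical names invented
for illustration. A single batch stands in for the per-cpu set, and the
hypercall is a stub that records what it was asked to hint. The sketch
does not model the requirement that the allocator avoid pages that are
being MADV_FREEd on the host while the hypercall is in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_SIZE 4   /* hypothetical; a real batch would be larger */
#define NR_PAGES   16  /* size of the toy "guest" */

/* Toy model of guest memory: true means the page is currently free. */
bool page_is_free[NR_PAGES];

/* Stand-in for one CPU's set of recently freed page frame numbers. */
struct free_batch {
    unsigned long pfn[BATCH_SIZE];
    size_t count;
};

/* What the most recent simulated hypercall was asked to hint. */
unsigned long hinted[BATCH_SIZE];
size_t nr_hinted;
unsigned int nr_hypercalls;

/* Step 2d: stub hypercall; a real host would MADV_FREE these pages. */
void hint_hypercall(const unsigned long *pfns, size_t n)
{
    for (size_t i = 0; i < n; i++)
        hinted[i] = pfns[i];
    nr_hinted = n;
    nr_hypercalls++;
}

/* Step 2: slow path, taken once the set is full. */
void flush_batch(struct free_batch *b)
{
    unsigned long still_free[BATCH_SIZE];
    size_t n = 0;

    for (size_t i = 0; i < b->count; i++)     /* 2a: walk the set       */
        if (page_is_free[b->pfn[i]])          /* 2b: still free?        */
            still_free[n++] = b->pfn[i];      /* 2c: add to hint list   */
    b->count = 0;
    if (n)
        hint_hypercall(still_free, n);        /* 2d: make the hypercall */
}

/* Step 1: remember the page being freed; flush when the set fills up. */
void toy_free_page(struct free_batch *b, unsigned long pfn)
{
    assert(pfn < NR_PAGES && b->count < BATCH_SIZE);
    page_is_free[pfn] = true;
    b->pfn[b->count++] = pfn;
    if (b->count == BATCH_SIZE)
        flush_batch(b);
}

/* Allocation side: the page is no longer free, so 2b will skip it. */
void toy_alloc_page(unsigned long pfn)
{
    assert(pfn < NR_PAGES);
    page_is_free[pfn] = false;
}
```

With this model, freeing pages 1, 2 and 3, reallocating page 2, and then
freeing page 4 triggers a single hypercall that hints pages 1, 3 and 4.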


So my question is this. Wei posted these numbers for balloon
inflation times, inflating 7GB of an 8GB idle guest:

	1) allocating pages (6.5%)
	2) sending PFNs to host (68.3%)
	3) address translation (6.1%)
	4) madvise (19%)

	It takes about 4126ms for the inflating process to complete.

It seems that this is an excessive amount of time to stay
under a lock. What are your estimates for Nitesh's work?
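
For scale, the percentages Wei reported can be turned into rough
per-phase times. The helper below is only a back-of-the-envelope
sketch; phase_ms is a hypothetical name, and the only inputs are the
4126ms total and the percentages listed above.

```c
/*
 * Rough time spent in one phase, in milliseconds, given the total
 * inflation time and that phase's share of it in percent.
 */
double phase_ms(double total_ms, double percent)
{
    return total_ms * percent / 100.0;
}
```

Applied to the 4126ms total, this gives roughly 268ms for allocating
pages, 2818ms for sending PFNs to the host, 252ms for address
translation and 784ms for madvise.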

-- 
MST