linux-kernel - Re: [RFC V2 0/9] x86/mmu:Introduce parallel memory virtualization to boost performance


Message-ID: <2592097d-3190-1862-b438-9e1b16616b82@redhat.com>
Date:   Fri, 25 Sep 2020 22:50:07 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Ben Gardon <bgardon@...gle.com>,
        yulei zhang <yulei.kernel@...il.com>
Cc:     Wanpeng Li <kernellwp@...il.com>, kvm <kvm@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        Jim Mattson <jmattson@...gle.com>,
        Junaid Shahid <junaids@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Xiao Guangrong <xiaoguangrong.eric@...il.com>,
        Haiwei Li <lihaiwei.kernel@...il.com>
Subject: Re: [RFC V2 0/9] x86/mmu:Introduce parallel memory virtualization to
 boost performance

On 25/09/20 19:30, Ben Gardon wrote:
> Oh, thank you for explaining that. I didn't realize the goal here was
> to improve LM performance. I was under the impression that this was to
> give VMs a better experience on startup for fast scaling or something.
> In your testing with live migration how has this affected the
> distribution of time between the phases of live migration? Just for
> terminology (since I'm not sure how standard it is across the
> industry) I think of a live migration as consisting of 3 stages:
> precopy, blackout, and postcopy. In precopy we're tracking the VM's
> working set via dirty logging and sending the contents of its memory
> to the target host. In blackout we pause the vCPUs on the source, copy
> minimal data to the target, and resume the vCPUs on the target. In
> postcopy we may still have some pages that have not been copied to the
> target and so request those in response to vCPU page faults via user
> fault fd or some other mechanism.
> 
> Does EPT pre-population preclude the use of a postcopy phase?

I think so.

As a quick recap, postcopy migration handles two kinds of pages: they
can be copied to the destination either in the background (stuff that
was dirty when userspace decided to transition to the blackout phase)
or on demand (relayed from KVM to userspace via get_user_pages and
userfaultfd).  Normally only on-demand pages would be served through
userfaultfd, while with prepopulation every missing page would be
faulted in from the kernel through userfaultfd.  In practice this
would just extend the blackout phase.
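
To make the difference concrete, here is a rough toy model of the split
described above. It is only a sketch and does not touch KVM or
userfaultfd; the function name postcopy_split, its parameters, and the
page numbers are all made up for illustration.

```python
def postcopy_split(missing, touched, prepopulate):
    """Toy model: return (pages served via userfaultfd, pages copied in background).

    missing     -- pages not yet on the destination when vCPUs would resume
    touched     -- pages the guest actually accesses after resuming
    prepopulate -- True if the kernel populates EPT for all guest memory up front
    """
    missing = set(missing)
    if prepopulate:
        # Assumption of this sketch: prepopulation touches every page from
        # the kernel, so every missing page goes through userfaultfd.
        on_demand = set(missing)
    else:
        # Only missing pages the guest touches are faulted in on demand.
        on_demand = missing & set(touched)
    background = missing - on_demand
    return len(on_demand), len(background)


missing = range(100)   # hypothetical: 100 pages still missing
touched = [1, 2, 3]    # hypothetical: guest touches 3 of them

print(postcopy_split(missing, touched, prepopulate=False))  # (3, 97)
print(postcopy_split(missing, touched, prepopulate=True))   # (100, 0)
```

In this model, prepopulation moves every missing page from the
background path to the userfaultfd path, which is the effect that would
lengthen the blackout phase.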

Paolo

> I would
> expect that to make the blackout phase really long. Has that not been
> a problem for you?
> 
> I love the idea of partial EPT pre-population during precopy if you
> could still handle postcopy and just pre-populate as memory came in.
>