linux-kernel - Re: [PATCH v3 00/21] KVM: Dirty ring interface

Message-ID: <20200109133434-mutt-send-email-mst@kernel.org>
Date:   Thu, 9 Jan 2020 14:08:52 -0500
From:   "Michael S. Tsirkin" <mst@...hat.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        Christophe de Dinechin <dinechin@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        Yan Zhao <yan.y.zhao@...el.com>,
        Alex Williamson <alex.williamson@...hat.com>,
        Jason Wang <jasowang@...hat.com>,
        Kevin Kevin <kevin.tian@...el.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        "Dr . David Alan Gilbert" <dgilbert@...hat.com>
Subject: Re: [PATCH v3 00/21] KVM: Dirty ring interface

On Thu, Jan 09, 2020 at 12:08:49PM -0500, Peter Xu wrote:
> On Thu, Jan 09, 2020 at 11:40:23AM -0500, Michael S. Tsirkin wrote:
> 
> [...]
> 
> > > > I know it's mostly relevant for huge VMs, but OTOH these
> > > > probably use huge pages.
> > > 
> > > Yes huge VMs could benefit more, especially if the dirty rate is not
> > > that high, I believe.  Though, could you elaborate on why huge pages
> > > are special here?
> > > 
> > > Thanks,
> > 
> > With hugetlbfs there are fewer bits to test: e.g. with 2M pages a single
> > bit set marks 512 pages as dirty.  We do not take advantage of this
> > but it looks like a rather obvious optimization.
> 
> Right, but isn't that the trade-off between granularity of dirty
> tracking and how easy it is to collect the dirty bits?  Say, it'll be
> nearly impossible to migrate 1G-huge-page-backed guests if we track
> dirty bits using huge page granularity, since each touch of guest
> memory will cause another 1G memory to be transferred even if most of
> the content is the same.  2M can be somewhere in the middle, but still
> the same write amplification issue exists.
>

OK, I see I was unclear.

IIUC, at the moment KVM never uses a huge page if any part of that huge page
is tracked. But if all parts of the page are written to, then a huge page
is used.

In this situation the whole huge page is dirty and needs to be migrated.
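
To put numbers on the granularity trade-off discussed above, here is a minimal
sketch of the arithmetic. It assumes 4 KiB base pages and x86-style 2 MiB and
1 GiB huge pages; the helper names are hypothetical and are not part of KVM.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SHIFT 12 /* assumption: 4 KiB base pages */

/* Number of base pages that one dirty bit covers at the given page shift. */
static inline uint64_t base_pages_per_dirty_bit(unsigned int page_shift)
{
	assert(page_shift >= BASE_PAGE_SHIFT && page_shift < 64);
	return UINT64_C(1) << (page_shift - BASE_PAGE_SHIFT);
}

/* Bytes that must be resent after a single write into one tracked unit. */
static inline uint64_t bytes_resent_per_write(unsigned int page_shift)
{
	assert(page_shift < 64);
	return UINT64_C(1) << page_shift;
}

/* Dirty bits needed to cover mem_bytes at the given granularity (rounded up). */
static inline uint64_t dirty_bits_needed(uint64_t mem_bytes,
					 unsigned int page_shift)
{
	uint64_t unit = UINT64_C(1) << page_shift;

	return (mem_bytes + unit - 1) / unit;
}
```

With these helpers, base_pages_per_dirty_bit(21) gives the 512 pages per bit
mentioned for 2M pages, while bytes_resent_per_write(30) gives the full 1 GiB
that a single write would cost under 1G-granularity tracking.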

> PS. that seems to be another topic after all besides the dirty ring
> series because we need to change our policy first if we want to track
> it with huge pages; with that, for dirty ring we can start to leverage
> the kvm_dirty_gfn.pad to store the page size with another new kvm cap
> when we really want.
> 
> Thanks,

Seems like leaking implementation detail to UAPI to me.
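
For concreteness, the idea quoted above (carrying a page size in
kvm_dirty_gfn.pad) might look roughly like the sketch below. Everything here is
an assumption for illustration: the struct is a stand-in whose layout is only
modelled on the pad field named in the thread, the helpers are hypothetical,
and nothing like this is claimed to exist in the series.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BASE_PAGE_SHIFT 12 /* assumption: 4 KiB base pages */

/*
 * Stand-in for a dirty ring entry. The field layout is assumed, not taken
 * from the UAPI headers; the real kvm_dirty_gfn may differ.
 */
struct dirty_gfn_sketch {
	uint32_t pad;    /* hypothetical reuse: page shift, 0 = base page */
	uint32_t slot;   /* memory slot identifier */
	uint64_t offset; /* page offset within the slot */
};

/* Record that this entry describes a page of size (1 << shift) bytes. */
static inline void sketch_set_page_shift(struct dirty_gfn_sketch *e,
					 unsigned int shift)
{
	assert(shift >= SKETCH_BASE_PAGE_SHIFT && shift < 64);
	e->pad = shift;
}

/*
 * Bytes covered by this entry. A zero pad is read as a base page, so
 * entries written without the hypothetical extension keep their meaning.
 */
static inline uint64_t sketch_dirty_bytes(const struct dirty_gfn_sketch *e)
{
	unsigned int shift = e->pad ? e->pad : SKETCH_BASE_PAGE_SHIFT;

	return UINT64_C(1) << shift;
}
```

Note that userspace would then have to interpret the page size KVM happened to
use for the mapping, which is the kind of implementation detail the objection
above is about.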


> -- 
> Peter Xu