linux-kernel - Re: [PATCH v5 15/19] KVM: Provide common implementation for generic dirty log functions

Message-ID: <87sgj99q9w.fsf@vitty.brq.redhat.com>
Date:   Mon, 17 Feb 2020 16:39:39 +0100
From:   Vitaly Kuznetsov <vkuznets@...hat.com>
To:     Sean Christopherson <sean.j.christopherson@...el.com>,
        Peter Xu <peterx@...hat.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Paul Mackerras <paulus@...abs.org>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Janosch Frank <frankja@...ux.ibm.com>,
        David Hildenbrand <david@...hat.com>,
        Cornelia Huck <cohuck@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, Marc Zyngier <maz@...nel.org>,
        James Morse <james.morse@....com>,
        Julien Thierry <julien.thierry.kdev@...il.com>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        linux-mips@...r.kernel.org, kvm@...r.kernel.org,
        kvm-ppc@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        kvmarm@...ts.cs.columbia.edu, linux-kernel@...r.kernel.org,
        Christoffer Dall <christoffer.dall@....com>,
        Philippe Mathieu-Daudé <f4bug@...at.org>
Subject: Re: [PATCH v5 15/19] KVM: Provide common implementation for generic dirty log functions

Sean Christopherson <sean.j.christopherson@...el.com> writes:

> On Fri, Feb 07, 2020 at 07:53:34PM -0500, Peter Xu wrote:
>> On Fri, Feb 07, 2020 at 04:42:33PM -0800, Sean Christopherson wrote:
>> > On Fri, Feb 07, 2020 at 07:18:32PM -0500, Peter Xu wrote:
>> > > On Fri, Feb 07, 2020 at 11:45:32AM -0800, Sean Christopherson wrote:
>> > > > +Vitaly for HyperV
>> > > > 
>> > > > On Thu, Feb 06, 2020 at 04:41:06PM -0500, Peter Xu wrote:
>> > > > > On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
>> > > > > > On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
>> > > > > > > But that matters to this patch because if MIPS can use
>> > > > > > > kvm_flush_remote_tlbs(), then we probably don't need this
>> > > > > > > arch-specific hook any more and we can directly call
>> > > > > > > kvm_flush_remote_tlbs() after sync dirty log when flush==true.
>> > > > > > 
>> > > > > > Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only thing
>> > > > > > that prevents calling kvm_flush_remote_tlbs() directly, but I have no
>> > > > > > clue as to the importance of that code.
>> > > > > 
>> > > > > As said above I think the x86 lockdep is really not necessary, then
>> > > > > considering MIPS could be the only one that will use the new hook
>> > > > > introduced in this patch...  Shall we figure that out first?
>> > > > 
>> > > > So I prepped a follow-up patch to make kvm_arch_dirty_log_tlb_flush() a
>> > > > MIPS-only hook and use kvm_flush_remote_tlbs() directly for arm and x86,
>> > > > but then I realized x86 *has* a hook to do a precise remote TLB flush.
>> > > > There's even an existing kvm_flush_remote_tlbs_with_address() call on a
>> > > > memslot, i.e. this exact scenario.  So arguably, x86 should be using the
>> > > > more precise flush and should keep kvm_arch_dirty_log_tlb_flush().
>> > > > 
>> > > > But, the hook is only used when KVM is running as an L1 on top of HyperV,
>> > > > and I assume dirty logging isn't used much, if at all, for L1 KVM on
>> > > > HyperV?
>> > > > 
>> > > > I see three options:
>> > > > 
>> > > >   1. Make kvm_arch_dirty_log_tlb_flush() MIPS-only and call
>> > > >      kvm_flush_remote_tlbs() directly for arm and x86.  Add comments to
>> > > >      explain when an arch should implement kvm_arch_dirty_log_tlb_flush().
>> > > > 
>> > > >   2. Change x86 to use kvm_flush_remote_tlbs_with_address() when flushing
>> > > >      a memslot after the dirty log is grabbed by userspace.
>> > > > 
>> > > >   3. Keep the resulting code as is, but add a comment in x86's
>> > > >      kvm_arch_dirty_log_tlb_flush() to explain why it uses
>> > > >      kvm_flush_remote_tlbs() instead of the with_address() variant.
>> > > > 
>> > > > I strongly prefer (2) or (3), but I'll defer to Vitaly as to which of
>> > > > those is preferable.
>> > > > 
>> > > > I don't like (1) because (a) it requires more lines of code (well,
>> > > > comments) to explain why kvm_flush_remote_tlbs() is the default, and
>> > > > (b) it would require even more comments, which would be x86-specific
>> > > > in generic KVM, to explain why x86 doesn't use its with_address()
>> > > > flush, or we'd lose that info altogether.
>> > > > 
>> > > 
>> > > I proposed the 4th solution here:
>> > > 
>> > > https://lore.kernel.org/kvm/20200207223520.735523-1-peterx@redhat.com/
>> > > 
>> > > I'm not sure whether that's acceptable, but if it is, then we can
>> > > drop the kvm_arch_dirty_log_tlb_flush() hook, or even move on to
>> > > per-slot tlb flushing.
>> > 
>> > This effectively is per-slot TLB flushing, it just has a different name.
>> > I.e. s/kvm_arch_dirty_log_tlb_flush/kvm_arch_flush_remote_tlbs_memslot.
>> > I'm not opposed to that name change.  And on second and third glance, I
>> > probably prefer it.  That would more or less follow the naming of
>> > kvm_arch_flush_shadow_all() and kvm_arch_flush_shadow_memslot().
>> 
>> Note that the major point of the above patchset is not about doing tlb
>> flush per-memslot or globally.  It's more about whether we can provide
>> a common entrance for TLB flushing.  Say, after that series, we should
>> be able to flush the TLB on all archs (most notably MIPS) as:
>> 
>>   kvm_flush_remote_tlbs(kvm);
>> 
>> And with the same idea we can also introduce the ranged version.
>> 
>> > 
>> > I don't want to go straight to kvm_arch_flush_remote_tlb_with_address()
>> > because that loses the important distinction (on x86) that slots_lock is
>> > expected to be held.
>> 
>> Sorry I'm still puzzled as to why that lockdep is so important and
>> special for x86...  For example, what if we move that lockdep to the
>> callers of the kvm_arch_dirty_log_tlb_flush() calls so it protects
>> even more archs (where we do get/clear dirty log)?  IMHO the callers
>> must hold slots_lock anyway, whether on x86 or not.
>
>
> Following the breadcrumbs leads to the comment in
> kvm_mmu_slot_remove_write_access(), which says:
>
>         /*
>          * kvm_mmu_slot_remove_write_access() and kvm_vm_ioctl_get_dirty_log()
>          * which do tlb flush out of mmu-lock should be serialized by
>          * kvm->slots_lock otherwise tlb flush would be missed.
>          */
>
> I.e. write-protecting a memslot and grabbing the dirty log for the memslot
> need to be serialized.  It's quite obvious *now* that get_dirty_log() holds
> slots_lock, but the purpose of lockdep assertions isn't just to verify the
> current functionality, it's to help ensure the correctness for future code
> and to document assumptions in the code.
>
> Digging deeper, there are four functions, all related to dirty logging, in
> the x86 mmu that basically open code what x86's
> kvm_arch_flush_remote_tlbs_memslot() would look like if it used range-based
> flushing.
>
> Unless it's functionally incorrect (Vitaly?), going with option (2) and
> naming the hook kvm_arch_flush_remote_tlbs_memslot() seems like the obvious
> choice, e.g. the final cleanup gives this diff stat:

(I apologize again for not replying in time)

I think this is a valid approach and your option (2) would also be my
choice. I also don't think there's going to be a problem when (if)
Hyper-V adds support for PML (eVMCSv2?).
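
For illustration only, the shape of option (2) can be sketched in userspace
C.  This is not the patch: every mock_* type and function below is a
hypothetical stand-in, the boolean replaces a real lockdep assertion on
slots_lock, and the fallback to a full flush when no ranged flush is
available is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a memslot: a contiguous range of guest frames. */
struct mock_memslot {
	uint64_t base_gfn;	/* first guest frame number in the slot */
	uint64_t npages;	/* number of pages in the slot */
};

/* Hypothetical stand-in for struct kvm, with counters to observe flushes. */
struct mock_kvm {
	bool slots_lock_held;	/* stands in for holding kvm->slots_lock */
	bool has_range_flush;	/* whether a ranged remote flush is available */
	uint64_t last_flush_start;
	uint64_t last_flush_pages;
	unsigned int full_flushes;
	unsigned int range_flushes;
};

/* Mock of a full remote TLB flush. */
static void mock_flush_remote_tlbs(struct mock_kvm *kvm)
{
	kvm->full_flushes++;
}

/* Mock of a ranged remote TLB flush; returns non-zero if unsupported. */
static int mock_flush_remote_tlbs_with_address(struct mock_kvm *kvm,
					       uint64_t start_gfn,
					       uint64_t pages)
{
	if (!kvm->has_range_flush)
		return -1;

	kvm->last_flush_start = start_gfn;
	kvm->last_flush_pages = pages;
	kvm->range_flushes++;
	return 0;
}

/*
 * Sketch of a per-memslot flush hook: check that the caller serialized
 * against other dirty-log users, then flush only the slot's range and
 * fall back to a full flush if the ranged variant is not available.
 */
static void mock_flush_remote_tlbs_memslot(struct mock_kvm *kvm,
					   const struct mock_memslot *slot)
{
	/* Userspace substitute for a lockdep assertion on slots_lock. */
	assert(kvm->slots_lock_held);

	if (mock_flush_remote_tlbs_with_address(kvm, slot->base_gfn,
						 slot->npages))
		mock_flush_remote_tlbs(kvm);
}
```

The point of funnelling the dirty-log paths through one helper like this is
that the locking assumption is stated once instead of being open coded in
each caller.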

>
>  arch/x86/kvm/mmu/mmu.c | 34 +++++++++-------------------------
>  1 file changed, 9 insertions(+), 25 deletions(-)
>

Looks nice :-)

-- 
Vitaly