Message-ID: <CAGdbjm+CHK68W-7SrEYg5sZe5H8ujQL+VPjPL4EM4pSkkqP1tA@mail.gmail.com>
Date: Fri, 17 Jan 2025 14:33:10 -0800
From: Kevin Loughlin <kevinloughlin@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Zheyun Shen <szy0127@...u.edu.cn>, thomas.lendacky@....com, pbonzini@...hat.com, 
	tglx@...utronix.de, kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/2] KVM: SVM: Flush cache only on CPUs running SEV guest

On Tue, Dec 10, 2024 at 3:56 PM Kevin Loughlin <kevinloughlin@...gle.com> wrote:
>
> On Wed, Dec 4, 2024 at 2:07 PM Sean Christopherson <seanjc@...gle.com> wrote:
> >
> > On Wed, Dec 04, 2024, Kevin Loughlin wrote:
> > > On Tue, Dec 3, 2024 at 4:27 PM Sean Christopherson <seanjc@...gle.com> wrote:
> > > > > @@ -2152,7 +2191,7 @@ void sev_vm_destroy(struct kvm *kvm)
> > > > >        * releasing the pages back to the system for use. CLFLUSH will
> > > > >        * not do this, so issue a WBINVD.
> > > > >        */
> > > > > -     wbinvd_on_all_cpus();
> > > > > +     sev_do_wbinvd(kvm);
> > > >
> > > > I am 99% certain this wbinvd_on_all_cpus() can simply be dropped.  sev_vm_destroy()
> > > > is called after KVM's mmu_notifier has been unregistered, which means it's called
> > > > after kvm_mmu_notifier_release() => kvm_arch_guest_memory_reclaimed().
> > >
> > > I think we need a bit of rework before dropping it (which I propose we
> > > do in a separate series), but let me know if there's a mistake in my
> > > reasoning here...
> > >
> > > Right now, sev_guest_memory_reclaimed() issues writebacks for SEV and
> > > SEV-ES guests but does *not* issue writebacks for SEV-SNP guests.
> > > Thus, I believe it's possible a SEV-SNP guest reaches sev_vm_destroy()
> > > with dirty encrypted lines in processor caches. Because SME_COHERENT
> > > doesn't guarantee coherence across CPU-DMA interactions (d45829b351ee
> > > ("KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT
> > > CPUs")), it seems possible that the memory gets re-allocated for DMA,
> > > written back from an (unencrypted) DMA, and then corrupted when the
> > > dirty encrypted version gets written back over that, right?
> > >
> > > And potentially the same thing for why we can't yet drop the writeback
> > > in sev_flush_encrypted_page() without a bit of rework?
> >
> > Argh, this last one probably does apply to SNP.  KVM requires SNP VMs to be backed
> > with guest_memfd, and flushing for that memory is handled by sev_gmem_invalidate().
> > But the VMSA is kernel allocated and so needs to be flushed manually.  On the plus
> > side, the VMSA flush shouldn't use WB{NO}INVD unless things go sideways, so trying
> > to optimize that path isn't worth doing.
>
> Ah thanks, yes agreed for both (that dropping WB{NO}INVD is fine on
> the sev_vm_destroy() path given sev_gmem_invalidate() and that the
> sev_flush_encrypted_page() path still needs the WB{NO}INVD as a
> fallback for now).
>
> On that note, the WBINVD in sev_mem_enc_unregister_region() can be
> dropped too then, right? My understanding is that the host will
> instead do WB{NO}INVD for SEV(-ES) guests in
> sev_guest_memory_reclaimed(), and sev_gmem_invalidate() will handle
> SEV-SNP guests.

Never mind, we can't drop the WBINVD call in
sev_mem_enc_unregister_region() without a userspace opt-in because
userspace might otherwise rely on the flushing behavior; see Sean's
explanation in [0].

So all in all, I believe...

- we can drop the call in sev_vm_destroy()
- we *cannot* drop the call in sev_flush_encrypted_page(), nor in
sev_mem_enc_unregister_region().
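
To make the summary above easier to check against the code, here is a rough
userspace sketch of it as a lookup table. It is not kernel code and not part
of either series. The enum, struct, and helper names are hypothetical and
exist only for illustration; only the three kernel function names and the
reasons come from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three flush call sites discussed here. */
enum sev_flush_site {
	SITE_VM_DESTROY,
	SITE_FLUSH_ENCRYPTED_PAGE,
	SITE_MEM_ENC_UNREGISTER_REGION,
	SITE_COUNT,
};

/* Hypothetical record: which function, can the flush go, and why. */
struct sev_flush_verdict {
	const char *func;	/* kernel function containing the flush */
	bool can_drop;		/* true if the WB{NO}INVD can be removed */
	const char *reason;	/* rationale as stated in this thread */
};

static const struct sev_flush_verdict verdicts[SITE_COUNT] = {
	[SITE_VM_DESTROY] = {
		"sev_vm_destroy", true,
		"runs after the mmu_notifier release path; SNP guest_memfd "
		"memory is handled by sev_gmem_invalidate()",
	},
	[SITE_FLUSH_ENCRYPTED_PAGE] = {
		"sev_flush_encrypted_page", false,
		"the VMSA is kernel allocated, so WB{NO}INVD stays as a "
		"fallback for now",
	},
	[SITE_MEM_ENC_UNREGISTER_REGION] = {
		"sev_mem_enc_unregister_region", false,
		"userspace might rely on the flush; needs an opt-in first",
	},
};

/* Look up the verdict for one call site; aborts on an invalid site. */
static const struct sev_flush_verdict *
sev_flush_verdict(enum sev_flush_site site)
{
	assert((unsigned int)site < SITE_COUNT);
	return &verdicts[site];
}
```

For example, sev_flush_verdict(SITE_VM_DESTROY)->can_drop is true, while the
other two sites report false.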

Zheyun, if you get to this series before my own WBNOINVD series [1], I
can just rebase on top of yours. I will defer cutting these unneeded
calls to you and simply replace applicable WBINVD calls with WBNOINVD
in my series.

[0] https://lore.kernel.org/all/ZWrM622xUb4pe7gS@google.com/T/#md364d1fdfc65dc92e306276bd51298cb817c5e53
[1] https://lore.kernel.org/kvm/20250109225533.1841097-2-kevinloughlin@google.com/T/
>
> All in all, I now agree we can drop the unneeded case(s) of issuing
> WB{NO}INVDs in this series in an additional commit. I'll then rebase
> [0] on the latest version of this series and can also work on the
> migration optimizations atop all of it, if that works for you Sean.
>
> [0] https://lore.kernel.org/lkml/20241203005921.1119116-1-kevinloughlin@google.com/
>
> Thanks!
