Message-ID: <CAJD7tkahmyjXvwKO2=EfQRWu_BHPJ-8+eSEteZH5TGG3+jHtWw@mail.gmail.com>
Date: Thu, 16 Jan 2025 16:53:24 -0800
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Jim Mattson <jmattson@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] KVM: nVMX: Always use TLB_FLUSH_GUEST for nested VM-Enter/VM-Exit
On Thu, Jan 16, 2025 at 4:35 PM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Thu, Jan 16, 2025, Yosry Ahmed wrote:
> > On Thu, Jan 16, 2025 at 2:35 PM Sean Christopherson <seanjc@...gle.com> wrote:
> > > How about:
> > >
> > > * Note, only the hardware TLB entries need to be flushed, as VPID is
> > > * fully enabled from L1's perspective, i.e. there's no architectural
> > > * TLB flush from L1's perspective.
> >
> > I hate to bikeshed, but I want to explicitly call out that we do not
> > need to synchronize the MMU.
>
> Why? Honest question, I want to understand what's unclear. My hesitation to
> talk about synchronizing MMUs is that it brings things into play that aren't
> super relevant to this specific code, and might even add confusion. Specifically,
> kvm_vcpu_flush_tlb_guest() does NOT synchronize MMUs when EPT/TDP is enabled, but
> the fact that this path is reachable if and only if EPT is enabled is completely
> coincidental.
Personally, the main thing that was unclear to me, and that I wanted a
comment to clarify, was why we use KVM_REQ_TLB_FLUSH_GUEST in the
first two cases but KVM_REQ_TLB_FLUSH_CURRENT in the last one.
Here's my understanding:
In the first case (i.e. !nested_cpu_has_vpid(vmcs12)), the flush is
architecturally required from L1's perspective, so we need to flush
guest-generated TLB entries (and potentially synchronize KVM's MMU).
In the second case, KVM does not track the history of VPID12, so the
flush *may* be architecturally required from L1's perspective, so we
do the same thing.
In the last case though, the flush is NOT architecturally required
from L1's perspective, it's just an artifact of KVM's potential
failure to allocate a dedicated VPID for L2 despite L1 asking for it.
So ultimately, I don't want to specifically call out synchronizing
MMUs so much as I want to call out why this case uses
KVM_REQ_TLB_FLUSH_CURRENT and not KVM_REQ_TLB_FLUSH_GUEST like the
others. I only suggested calling out the MMU synchronization because
it's effectively the only difference between the two in this case.
I am open to any wording you think is best. I am also fine with just
dropping this completely, definitely not the hill to die on :)
>
> E.g. very hypothetically, if KVM used the same EPT root (I already forgot Intel's
> new acronym) for L1 and L2, then this would no longer be true:
>
> * If L0 uses EPT, L1 and L2 run with different EPTP because
> * guest_mode is part of kvm_mmu_page_role. Thus, TLB entries
> * are tagged with different EPTP.
>
> L1 vs. L2 EPT usage would no longer use separate ASID tags, and so KVM would
> need to FLUSH_CURRENT on transitions in most cases, e.g. to purge APICv mappings.
>
> The comment above !nested_cpu_has_vpid() talks at length about synchronizing MMUs
> because the EPT behavior in particular is subtle and rather unintuitive. I.e.
> the comment is much more about NOT using KVM_REQ_MMU_SYNC than it is about using
> KVM_REQ_TLB_FLUSH_GUEST.