linux-kernel - Re: [PATCH 0/2] KVM: x86: Fix and cleanup for recent AVIC changes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YWmt/A4pemf2050j@google.com>
Date:   Fri, 15 Oct 2021 16:36:12 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Maxim Levitsky <mlevitsk@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/2] KVM: x86: Fix and cleanup for recent AVIC changes

On Fri, Oct 15, 2021, Paolo Bonzini wrote:
> On 15/10/21 18:15, Sean Christopherson wrote:
> > > 
> > >                                          - now vCPU1 finally starts running the page fault code.
> > > 
> > >                                          - vCPU1 AVIC is still enabled
> > >                                            (because vCPU1 never handled KVM_REQ_APICV_UPDATE),
> > >                                            so the page fault code will populate the SPTE.
> > But vCPU1 won't install the SPTE if it loses the race to acquire mmu_lock, because
> > kvm_zap_gfn_range() bumps the notifier sequence and so vCPU1 will retry the fault.
> > If vCPU1 wins the race, i.e. sees the same sequence number, then the zap is
> > guaranteed to find the newly-installed SPTE.
> > 
> > And IMO, retrying is the desired behavior.  Installing a SPTE based on the global
> > state works, but it's all kinds of weird to knowingly take an action the directly
> > contradicts the current vCPU state.
> 
> I think both of you are correct. :)
> 
> Installing a SPTE based on global state is weird because this is a vCPU
> action; installing it based on vCPU state is weird because it is knowingly
> out of date.

If that's the argument, then kvm_faultin_page() should explicitly check for a
pending KVM_REQ_APICV_UPDATE, because I would then argue that contuining on when
KVM _knows_ its new SPTE will either get zapped (page fault wins the race) or
will get rejected (kvm_zap_gfn_range() wins the race) is just as wrong.  The SPTE
_cannot_ be used even if the page fault wins the race, becuase all vCPUs need to
process KVM_REQ_APICV_UPDATE and thus will be blocked until the initiating vCPU
zaps the range and drops the APICv lock.

And I personally do _not_ want to add a check for the request because it implies
the check is sufficient, which it is not, because the page fault doesn't yet hold
mmu_lock.

Since all answers are some form of wrong, IMO we should at least be coherent with
respect to the original page fault.