Date:   Fri, 15 May 2020 16:33:52 -0400
From:   Vivek Goyal <vgoyal@...hat.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Sean Christopherson <sean.j.christopherson@...el.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>, kvm@...r.kernel.org,
        x86@...nel.org, Andy Lutomirski <luto@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Gavin Shan <gshan@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/8] KVM: x86: extend struct kvm_vcpu_pv_apf_data with
 token info

On Fri, May 15, 2020 at 09:18:07PM +0200, Paolo Bonzini wrote:
> On 15/05/20 20:46, Sean Christopherson wrote:
> >> The new one using #VE is not coming very soon (we need to emulate it for
> >> <Broadwell and AMD processors, so it's not entirely trivial) so we are
> >> going to keep "page not ready" delivery using #PF for some time or even
> >> forever.  However, page ready notification as #PF is going away for good.
> > 
> > And isn't hardware based EPT Violation #VE going to require a completely
> > different protocol than what is implemented today?  For hardware based #VE,
> > KVM won't intercept the fault, i.e. the guest will need to make an explicit
> > hypercall to request the page.
> 
> Yes, but it's a fairly simple hypercall to implement.
> 
> >> That said, type1/type2 is quite bad. :)  Let's change that to page not
> >> present / page ready.
> > 
> > Why even bother using 'struct kvm_vcpu_pv_apf_data' for the #PF case?  VMX
> > only requires error_code[31:16]==0 and SVM doesn't vet it at all, i.e. we
> > can (ab)use the error code to indicate an async #PF by setting it to an
> > impossible value, e.g. 0xaaaa (a is for async!).  That particular error code
> > is even enforced by the SDM, which states:
> 
> Possibly, but it's water under the bridge now.
> And the #PF mechanism also has the problem with NMIs that happen before
> the error code is read
> and page faults happening in the handler (you may connect some dots now).
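
(As an aside, my mental model of the explicit-hypercall flow mentioned above
is roughly the sketch below. KVM_HC_ASYNC_PF_REQUEST is a made-up name and
number, purely for illustration; no such hypercall exists today.)

#include <asm/kvm_para.h>	/* kvm_hypercall1() */

#define KVM_HC_ASYNC_PF_REQUEST	100	/* hypothetical hypercall number */

/*
 * Hardware-based EPT violation #VE: KVM no longer intercepts the fault,
 * so the guest has to ask the host for the page explicitly and then wait
 * for the separate "page ready" notification.
 */
static void handle_ve_page_not_present(unsigned long gpa)
{
	kvm_hypercall1(KVM_HC_ASYNC_PF_REQUEST, gpa);
}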

I understood that the following was racy.

do_async_page_fault()  <--- KVM-injected async page fault
  NMI happens (before kvm_read_and_reset_pf_reason() is done)
    -> do_async_page_fault()  (this is a regular page fault, but it reads
                               the reason from the shared area and treats
                               itself as an async page fault)

So this is racy.
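
IOW, something like this simplified sketch (not the actual code, just my
reading of the current handler, to show the window in which the wrong
invocation consumes the shared reason):

#include <linux/percpu.h>
#include <linux/ptrace.h>
#include <asm/kvm_para.h>	/* struct kvm_vcpu_pv_apf_data, KVM_PV_REASON_* */

/* simplified: stands in for the real per-CPU shared area */
static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason);

static u32 kvm_read_and_reset_pf_reason(void)
{
	u32 reason = __this_cpu_read(apf_reason.reason);

	__this_cpu_write(apf_reason.reason, 0);
	return reason;
}

/*
 * If an NMI arrives before the outer invocation reaches
 * kvm_read_and_reset_pf_reason(), and a #PF happens inside the NMI
 * handler, the inner (regular) fault reads the still-pending reason
 * and wrongly treats itself as the async page fault.
 */
void do_async_page_fault(struct pt_regs *regs, unsigned long error_code,
			 unsigned long address)
{
	switch (kvm_read_and_reset_pf_reason()) {
	case KVM_PV_REASON_PAGE_NOT_PRESENT:
		/* wait until the host reports the page is ready */
		break;
	case KVM_PV_REASON_PAGE_READY:
		/* wake the task waiting on this token */
		break;
	default:
		/* ordinary page fault handling */
		break;
	}
}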

But if we get rid of the notion of reading from the shared region in the
page fault handler, won't we get rid of this race?

I am assuming that error_code is not racy, as it is pushed on the stack.
What am I missing?
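
That is, if the "page not present" case were flagged purely through an
otherwise-impossible error code (the 0xaaaa value Sean floated above), the
handler would never need to touch the shared area at all. Rough sketch only,
not actual code:

#include <linux/ptrace.h>

/* 0xaaaa is just the value mentioned above; purely illustrative */
#define ASYNC_PF_NOT_PRESENT_EC	0xaaaaUL

/* hypothetical entry point, not the real handler */
void pf_entry(struct pt_regs *regs, unsigned long error_code,
	      unsigned long address)
{
	if (error_code == ASYNC_PF_NOT_PRESENT_EC) {
		/*
		 * error_code was pushed on the stack by the CPU for this
		 * very fault, so an intervening NMI (and any #PF it takes)
		 * cannot clobber it the way a nested handler can consume
		 * the shared apf_reason field.
		 */
		/* async "page not ready": sleep until the wakeup arrives */
		return;
	}

	/* ordinary #PF handling */
}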

Thanks
Vivek
