Message-ID: <20200408203425.GD93547@redhat.com>
Date:   Wed, 8 Apr 2020 16:34:25 -0400
From:   Vivek Goyal <vgoyal@...hat.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        Andy Lutomirski <luto@...capital.net>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        kvm list <kvm@...r.kernel.org>, stable <stable@...r.kernel.org>
Subject: Re: [PATCH v2] x86/kvm: Disable KVM_ASYNC_PF_SEND_ALWAYS

On Wed, Apr 08, 2020 at 08:01:36PM +0200, Thomas Gleixner wrote:
> Paolo Bonzini <pbonzini@...hat.com> writes:
> > On 08/04/20 17:34, Sean Christopherson wrote:
> >> On Wed, Apr 08, 2020 at 10:23:58AM +0200, Paolo Bonzini wrote:
> >>> Page-not-present async page faults are almost a perfect match for the
> >>> hardware use of #VE (and it might even be possible to let the processor
> >>> deliver the exceptions).
> >> 
> >> My "async" page fault knowledge is limited, but if the desired behavior is
> >> to reflect a fault into the guest for select EPT Violations, then yes,
> >> enabling EPT Violation #VEs in hardware is doable.  The big gotcha is that
> >> KVM needs to set the suppress #VE bit for all EPTEs when allocating a new
> >> MMU page, otherwise not-present faults on zero-initialized EPTEs will get
> >> reflected.
> >> 
> >> Attached a patch that does the prep work in the MMU.  The VMX usage would be:
> >> 
> >> 	kvm_mmu_set_spte_init_value(VMX_EPT_SUPPRESS_VE_BIT);
> >> 
> >> when EPT Violation #VEs are enabled.  It's 64-bit only as it uses stosq to
> >> initialize EPTEs.  32-bit could also be supported by doing memcpy() from
> >> a static page.
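
For reference, a minimal sketch of that prep work (not the attached patch;
the helper names below are made up, and EPT_SUPPRESS_VE stands for the
"suppress #VE" bit, bit 63 of the EPTE):

#include <stdint.h>

#define EPT_SUPPRESS_VE         (1ULL << 63)    /* "suppress #VE" EPTE bit */
#define SPTES_PER_PAGE          512

static uint64_t spte_init_value;        /* 0, or EPT_SUPPRESS_VE when #VE is on */

/* Sketch of the kvm_mmu_set_spte_init_value() hook from the quoted mail. */
void set_spte_init_value_sketch(uint64_t value)
{
        spte_init_value = value;
}

/* Fill a freshly allocated MMU page so not-present EPTEs still carry the
 * suppress-#VE bit instead of being all-zero.  A 64-bit kernel can do
 * this with a stosq-based fill as noted above; a plain C loop is shown. */
void init_shadow_page_sketch(uint64_t *sptep)
{
        for (int i = 0; i < SPTES_PER_PAGE; i++)
                sptep[i] = spte_init_value;
}
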
> >
> > The complication is that (at least according to the current ABI) we
> > would not want #VE to kick in if the guest currently has IF=0 (and
> > possibly CPL=0).  But the ABI is not set in stone, and anyway the #VE
> > protocol is a decent one and worth using as a base for whatever PV
> > protocol we design.
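
For context, the ABI referred to here is the MSR-based async-pf enable;
a simplified guest-side sketch of where KVM_ASYNC_PF_SEND_ALWAYS fits in
(stand-in helpers below, not the kernel's kvm_guest_cpu_init()):

#include <stdint.h>

#define MSR_KVM_ASYNC_PF_EN      0x4b564d02
#define KVM_ASYNC_PF_ENABLED     (1ULL << 0)
#define KVM_ASYNC_PF_SEND_ALWAYS (1ULL << 1)    /* also deliver at CPL == 0 */

/* Dummy stand-ins so the sketch is self-contained; real code uses
 * wrmsrl() and slow_virt_to_phys() on a 64-byte aligned per-cpu area. */
static void wrmsr64(uint32_t msr, uint64_t val) { (void)msr; (void)val; }
static uint64_t phys_addr_of(void *p) { return (uint64_t)(uintptr_t)p; }

static uint32_t apf_reason;     /* host writes "page not present"/"page ready" here */

void enable_async_pf_sketch(int send_always)
{
        uint64_t val = phys_addr_of(&apf_reason) | KVM_ASYNC_PF_ENABLED;

        if (send_always)        /* the bit $SUBJECT disables */
                val |= KVM_ASYNC_PF_SEND_ALWAYS;

        wrmsr64(MSR_KVM_ASYNC_PF_EN, val);
}
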
> 
> Forget the current async pf semantics (or the lack thereof). You really
> want to start from scratch and ignore the whole thing.
> 
> The charm of #VE is that the hardware can inject it and it does not nest
> until the guest clears the second word in the VE information area. If
> that word is not 0 then you get a regular vmexit where you suspend the
> vcpu until the nested problem is solved.

So IIUC, only one task on a vcpu at a time can afford to relinquish the
cpu to another task. If the next task also triggers an EPT violation,
that will result in a VM exit (as the previous #VE is not complete yet)
and the vcpu will be halted.
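
To make that "second word" concrete, the #VE information area looks
roughly like this (layout per the SDM, field names are mine), and the
guest handler re-arms #VE delivery by clearing the busy word:

#include <stdint.h>

struct ve_info {
        uint32_t exit_reason;   /* 48 == EPT violation */
        uint32_t busy;          /* CPU writes ~0 here; != 0 suppresses further #VEs */
        uint64_t exit_qual;
        uint64_t gla;           /* guest linear address */
        uint64_t gpa;           /* guest physical address */
        uint16_t eptp_index;
};

/* Hypothetical guest-side handler skeleton. */
void handle_ve_sketch(struct ve_info *ve)
{
        uint64_t gpa = ve->gpa;

        /* ... ask the host to resolve 'gpa', then either wait here
         * ("suspend me") or park the current task and schedule ... */
        (void)gpa;

        /* Until this store, any further EPT fault on this vcpu takes a
         * VM exit instead of nesting another #VE. */
        __atomic_store_n(&ve->busy, 0, __ATOMIC_RELEASE);
}
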

> 
> So you really don't worry about the guest CPU state at all. The guest
> side #VE handler has to decide what it wants from the host depending on
> its internal state:
> 
>      - Suspend me and resume once the EPT fail is solved
> 
>      - Let me park the failing task and tell me once you resolved the
>        problem.
> 
> That's pretty straightforward and avoids the whole nonsense which the
> current mess contains. It completely avoids the allocation stuff as
> well, since you use a PV page to which the guest copies the VE
> information.
> 
> The notification that a problem has been resolved needs to go through a
> separate vector which still has the IF=1 requirement obviously.

How is this vector decided between guest and host? Will failure to fault
in the page be communicated through the same vector?
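
Purely as a strawman to make the question concrete (nothing below is
existing ABI): a completion record with an error field would be one way
for the "failed to fault in" case to ride the same notification, however
the vector itself ends up being negotiated:

#include <stdint.h>

enum pv_ept_request {                   /* hypothetical */
        PV_EPT_SUSPEND_ME,              /* resume the vcpu once resolved */
        PV_EPT_ASYNC_NOTIFY,            /* park the task, notify me later */
};

struct pv_ept_completion {              /* hypothetical shared-page record */
        uint64_t gpa;                   /* which fault this answers */
        uint32_t error;                 /* != 0: host failed to resolve it */
};

/* Guest -> host: a request carrying (type, gpa).
 * Host -> guest: fill a completion record, then raise the agreed vector,
 * which is only ever injected while the guest has IF=1. */
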

Thanks
Vivek
