linux-kernel - Re: [PATCH v2] x86/kvm: Disable KVM_ASYNC_PF_SEND

Message-ID: <92ea7036-0b77-20da-34ac-f425e6f233c2@redhat.com>
Date:   Thu, 9 Apr 2020 11:03:50 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Andy Lutomirski <luto@...capital.net>,
        Vivek Goyal <vgoyal@...hat.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        kvm list <kvm@...r.kernel.org>, stable <stable@...r.kernel.org>
Subject: Re: [PATCH v2] x86/kvm: Disable KVM_ASYNC_PF_SEND_ALWAYS

On 08/04/20 15:01, Thomas Gleixner wrote:
> 
> And it comes with restrictions:
> 
>     The Do Other Stuff event can only be delivered when guest IF=1.
> 
>     If guest IF=0 then the host has to suspend the guest until the
>     situation is resolved.
> 
>     The 'Situation resolved' event must also wait for a guest IF=1 slot.

Additionally:

- the do other stuff event must be delivered to the same CPU that is
causing the host-side page fault

- the do other stuff event provides a token that identifies the cause
and the situation resolved event provides a matching token

These constraints are why I think the do other stuff event looks very
much like a #VE.  But I think we're in violent agreement after all.
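
For illustration only, the delivery rules above (guest IF=1, same CPU,
matching tokens) can be sketched as a small self-contained C model.
This is not KVM code: struct apf_waiter, MAX_WAITERS and the apf_*()
helpers are hypothetical names made up for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WAITERS 8	/* hypothetical table size */

/* One outstanding "do other stuff" event, identified by its token. */
struct apf_waiter {
	uint32_t token;
	int cpu;
	bool in_use;
};

static struct apf_waiter waiters[MAX_WAITERS];

/*
 * "Do other stuff" event: only deliverable to the CPU that caused the
 * host-side fault, and only while guest IF=1.  A false return means the
 * host has to suspend the vCPU instead.
 */
static bool apf_do_other_stuff(int faulting_cpu, int target_cpu,
			       uint32_t token, bool guest_if)
{
	size_t i;

	if (target_cpu != faulting_cpu || !guest_if)
		return false;

	for (i = 0; i < MAX_WAITERS; i++) {
		if (!waiters[i].in_use) {
			waiters[i].token = token;
			waiters[i].cpu = target_cpu;
			waiters[i].in_use = true;
			return true;
		}
	}
	return false;	/* table full: treat like the IF=0 case */
}

/* True while a "do other stuff" event with this token is outstanding. */
static bool apf_is_pending(uint32_t token)
{
	size_t i;

	for (i = 0; i < MAX_WAITERS; i++)
		if (waiters[i].in_use && waiters[i].token == token)
			return true;
	return false;
}

/*
 * "Situation resolved" event: must also wait for a guest IF=1 slot and
 * must carry a token matching an outstanding "do other stuff" event.
 */
static bool apf_situation_resolved(uint32_t token, bool guest_if)
{
	size_t i;

	if (!guest_if)
		return false;

	for (i = 0; i < MAX_WAITERS; i++) {
		if (waiters[i].in_use && waiters[i].token == token) {
			waiters[i].in_use = false;
			return true;
		}
	}
	return false;	/* no matching token */
}
```

In this model a resolved event with an unknown token, or one arriving
while IF=0, is simply not delivered; the host would retry later.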

> If you just want to solve Vivek's problem, then it's good enough. I.e. the
> file truncation turns the EPT entries into #VE convertible entries and
> the guest #VE handler can figure it out. This one can be injected
> directly by the hardware, i.e. you don't need a VMEXIT.
> 
> If you want the opportunistic do other stuff mechanism, then #VE has
> exactly the same problems as the existing async "PF". It's not magically
> making that go away.

You can inject #VE from the hypervisor too, with PV magic to distinguish
the two.  However that's not necessarily a good idea because it makes it
harder to switch to hardware delivery in the future.

> One possible solution might be to make all recoverable EPT entries
> convertible and let the HW inject #VE for those.
> 
> So the #VE handler in the guest would have to do:
> 
>        if (!recoverable()) {
>                if (user_mode(regs))
>                        send_signal();
>                else if (!fixup_exception())
>                        die_hard();
>                goto done;
>        }
> 
>        store_ve_info_in_pv_page();
> 
>        if (!user_mode(regs) || !preemptible()) {
>                hypercall_resolve_ept(can_continue = false);
>        } else {
>                init_completion();
>                hypercall_resolve_ept(can_continue = true);
>                wait_for_completion();
>        }
> 
> or something like that.

Yes, pretty much.  The VE info can also be passed down to the hypercall
as arguments.
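
A rough, self-contained sketch of that variant, with the VE info passed
as hypercall arguments instead of going through a PV page.  Everything
here is an assumption for illustration: struct ve_info is a simplified
stand-in for the real #VE information area, hypercall_resolve_ept() is
a stub that only records its arguments, and the guest state is passed
in as plain booleans.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical subset of the #VE information. */
struct ve_info {
	uint64_t exit_qualification;
	uint64_t guest_linear_addr;
	uint64_t guest_phys_addr;
};

enum ve_action {
	VE_SEND_SIGNAL,		/* unrecoverable, fault came from user mode */
	VE_FIXUP,		/* unrecoverable, kernel fixup entry exists */
	VE_DIE,			/* unrecoverable, no fixup */
	VE_RESOLVE_SYNC,	/* host suspends the vCPU until resolved */
	VE_RESOLVE_ASYNC,	/* task sleeps, vCPU does other stuff */
};

/* Stub hypercall: records the last call so the sketch can be inspected. */
static struct {
	uint64_t gpa;
	uint64_t qual;
	bool can_continue;
	int calls;
} last_hc;

static void hypercall_resolve_ept(uint64_t gpa, uint64_t qual,
				  bool can_continue)
{
	last_hc.gpa = gpa;
	last_hc.qual = qual;
	last_hc.can_continue = can_continue;
	last_hc.calls++;
}

static enum ve_action handle_ve(const struct ve_info *info, bool recoverable,
				bool user_mode, bool preemptible,
				bool has_fixup)
{
	if (!recoverable) {
		if (user_mode)
			return VE_SEND_SIGNAL;
		return has_fixup ? VE_FIXUP : VE_DIE;
	}

	/* Kernel mode or non-preemptible context: cannot sleep here. */
	if (!user_mode || !preemptible) {
		hypercall_resolve_ept(info->guest_phys_addr,
				      info->exit_qualification, false);
		return VE_RESOLVE_SYNC;
	}

	/* The caller would now wait on a completion until the wakeup. */
	hypercall_resolve_ept(info->guest_phys_addr,
			      info->exit_qualification, true);
	return VE_RESOLVE_ASYNC;
}
```

The returned action stands in for what a real handler would do in
place (send the signal, wait for the completion, and so on).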

Paolo

> The hypercall to resolve the EPT fault on the host acts on the
> can_continue argument.
> 
> If false, it suspends the guest vCPU and only returns when done.
> 
> If true it kicks the resolve process and returns to the guest which
> suspends the task and tries to do something else.
> 
> The wakeup side needs to be a regular interrupt and cannot go through
> #VE.
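
The host-side behaviour described in the quoted text above can be
modelled roughly as follows.  This is only a toy state machine under
stated assumptions: struct vcpu, host_resolve_ept() and
host_resolve_done() are hypothetical names, and "regular interrupt" is
reduced to a pending flag.

```c
#include <stdbool.h>

enum vcpu_state { VCPU_RUNNING, VCPU_SUSPENDED };

/* Hypothetical per-vCPU bookkeeping for one outstanding resolve. */
struct vcpu {
	enum vcpu_state state;
	bool resolve_pending;
	bool wakeup_irq_pending;
};

/* Host side of the resolve hypercall. */
static void host_resolve_ept(struct vcpu *v, bool can_continue)
{
	v->resolve_pending = true;

	/*
	 * can_continue == false: suspend the vCPU; the hypercall only
	 * returns once the fault is resolved.
	 * can_continue == true: kick the resolve and return right away,
	 * so the guest can suspend the task and run something else.
	 */
	if (!can_continue)
		v->state = VCPU_SUSPENDED;
}

/* Called when the host has resolved the fault. */
static void host_resolve_done(struct vcpu *v)
{
	if (!v->resolve_pending)
		return;
	v->resolve_pending = false;

	if (v->state == VCPU_SUSPENDED) {
		/* Synchronous case: just let the hypercall return. */
		v->state = VCPU_RUNNING;
	} else {
		/* Async case: wake the task via a regular interrupt, not #VE. */
		v->wakeup_irq_pending = true;
	}
}
```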