lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 10 May 2022 14:49:15 +0000
From:   Jon Kohler <jon@...anix.com>
To:     Sean Christopherson <seanjc@...gle.com>
CC:     Jon Kohler <jon@...anix.com>, Borislav Petkov <bp@...en8.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Balbir Singh <sblbir@...zon.com>,
        Kim Phillips <kim.phillips@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Kees Cook <keescook@...omium.org>,
        Waiman Long <longman@...hat.com>
Subject: Re: [PATCH v3] x86/speculation, KVM: only IBPB for
 switch_mm_always_ibpb on vCPU load



> On May 10, 2022, at 10:22 AM, Sean Christopherson <seanjc@...gle.com> wrote:
> 
> On Sat, Apr 30, 2022, Jon Kohler wrote:
>> 
>>> On Apr 30, 2022, at 5:50 AM, Borislav Petkov <bp@...en8.de> wrote:
>>> So let me try to understand this use case: you have a guest and a bunch
>>> of vCPUs which belong to it. And that guest gets switched between those
>>> vCPUs and KVM does IBPB flushes between those vCPUs.
>>> 
>>> So either I'm missing something - which is possible - but if not, that
>>> "protection" doesn't make any sense - it is all within the same guest!
>>> So that existing behavior was silly to begin with so we might just as
>>> well kill it.
>> 
>> Close, its not 1 guest with a bunch of vCPU, its a bunch of guests with
>> a small amount of vCPUs, thats the small nuance here, which is one of 
>> the reasons why this was hard to see from the beginning. 
>> 
>> AFAIK, the KVM IBPB is avoided when switching in between vCPUs
>> belonging to the same vmcs/vmcb (i.e. the same guest), e.g. you could 
>> have one VM highly oversubscribed to the host and you wouldn’t see
>> either the KVM IBPB or the switch_mm IBPB. All good. 
> 
> No, KVM does not avoid IBPB when switching between vCPUs in a single VM.  Every
> vCPU has a separate VMCS/VMCB, and so the scenario described above where a single
> VM has a bunch of vCPUs running on a limited set of logical CPUs will emit IBPB
> on every single switch.

Ah! Right, ok thanks for helping me get my wires uncrossed there, I was getting
confused from the nested optimization made on
5c911beff KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02

So the only time we’d *not* issue IBPB is if the current per-vcpu vmcs/vmcb is
still loaded in the non-nested case, or between guests in the nested case.

Walking through my thoughts again here with this fresh in my mind:

In that example, say guest A has vCPU0 and vCPU1 and has to switch
in between the two on the same pCPU, it isn’t doing a switch_mm()
because its the same mm_struct; however, I’d wager to say that if you
had an attacker on the guest VM, executing an attack on vCPU0 with
the intent of attacking vCPU1 (which is up to run next), you have far
bigger problems, as that would imply the guest is completely
compromised, so why would they even waste time on a complex
prediction attack when they have that level of system access in
the first place?

Going back to the original commit documentation that Boris called out, 
that specifically says:

   * Mitigate guests from being attacked by other guests.
     - This is addressed by issing IBPB when we do a guest switch.

Since you need to go through switch_mm() to change mm_struct from
Guest A to Guest B, it makes no sense to issue the barrier in KVM,
as the kernel is already giving that “for free” (from KVM’s perspective) as
the guest-to-guest transition is already covered by cond_mitigation().

That would apply equally for switches within both nested and non-nested
cases, because switch_mm needs to be called between guests.

Powered by blists - more mailing lists