Message-ID: <Ynp6ZoQUwtlWPI0Z@google.com>
Date: Tue, 10 May 2022 14:44:54 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Jon Kohler <jon@...anix.com>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"x86@...nel.org" <x86@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Balbir Singh <sblbir@...zon.com>,
Kim Phillips <kim.phillips@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Kees Cook <keescook@...omium.org>,
Waiman Long <longman@...hat.com>
Subject: Re: [PATCH v3] x86/speculation, KVM: only IBPB for
switch_mm_always_ibpb on vCPU load
On Sat, Apr 30, 2022, Borislav Petkov wrote:
> On Sat, Apr 30, 2022 at 02:50:35PM +0000, Jon Kohler wrote:
> > This is 100% a fair ask, I appreciate the diligence, as we’ve all been there
> > on the ‘other side’ of changes to complex areas and spent hours digging on
> > git history, LKML threads, SDM/APM, and other sources trying to derive
> > why the heck something is the way it is.
>
> Yap, that's basically proving my point and why I want stuff to be
> properly documented so that the question "why was it done this way" can
> always be answered satisfactorily.
>
> > AFAIK, the KVM IBPB is avoided when switching in between vCPUs
> > belonging to the same vmcs/vmcb (i.e. the same guest), e.g. you could
> > have one VM highly oversubscribed to the host and you wouldn’t see
> > either the KVM IBPB or the switch_mm IBPB. All good.
> >
> > Reference vmx_vcpu_load_vmcs() and svm_vcpu_load() and the
> > conditionals prior to the barrier.
>
> So this is where something's still missing.
>
> > However, the pain ramps up when you have a bunch of separate guests,
> > especially with a small amount of vCPUs per guest, so the switching is more
> > likely to be in between completely separate guests.
>
> If the guests are completely separate, then it should fall into the
> switch_mm() case.
>
> Unless it has something to do with, as I looked at the SVM side of
> things, the VMCBs:
>
> if (sd->current_vmcb != svm->vmcb) {
>
> So it is not only different guests but also within the same guest and
> when the VMCB of the vCPU is not the current one.
Yep.
> But then if VMCB of the vCPU is not the current, per-CPU VMCB, then that
> CPU ran another guest so in order for that other guest to attack the
> current guest, then its branch pred should be flushed.
That CPU ran a different _vCPU_; whether or not it ran a different guest, i.e. a
different security domain, is unknown.
> But I'm likely missing a virt aspect here so I'd let Sean explain what
> the rules are...
I don't think you're missing anything. I think the original 15d45071523d ("KVM/x86:
Add IBPB support") was simply wrong.
As I see it:
1. If the vCPUs belong to the same VM, they are in the same security domain and
do not need an IBPB.
2. If the vCPUs belong to different VMs, and each VM is in its own mm_struct,
defer to switch_mm_irqs_off() to handle IBPB as an mm switch is guaranteed to
occur prior to loading a vCPU belonging to a different VM.
3. If the vCPUs belong to different VMs, but multiple VMs share an mm_struct,
then the security benefits of an IBPB when switching vCPUs are dubious, at best.
If we only consider #1 and #2, then KVM doesn't need an IBPB, period.
#3 is the only one that's a grey area. I have no objection to omitting the IBPB entirely
even in that case, because none of us can identify any tangible value in emitting one there.