Message-ID: <C39CD5E4-3705-4D1A-A67D-43CBB7D1950B@nutanix.com>
Date: Thu, 12 May 2022 20:33:43 +0000
From: Jon Kohler <jon@...anix.com>
To: Jim Mattson <jmattson@...gle.com>
CC: Sean Christopherson <seanjc@...gle.com>,
Jon Kohler <jon@...anix.com>, Jonathan Corbet <corbet@....net>,
Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Joerg Roedel <joro@...tes.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
Kees Cook <keescook@...omium.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Kim Phillips <kim.phillips@....com>,
Lukas Bulwahn <lukas.bulwahn@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
Ashok Raj <ashok.raj@...el.com>,
KarimAllah Ahmed <karahmed@...zon.de>,
David Woodhouse <dwmw@...zon.co.uk>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
"kvm @ vger . kernel . org" <kvm@...r.kernel.org>,
Waiman Long <longman@...hat.com>
Subject: Re: [PATCH v4] x86/speculation, KVM: remove IBPB on vCPU load
> On May 12, 2022, at 4:27 PM, Jim Mattson <jmattson@...gle.com> wrote:
>
> On Thu, May 12, 2022 at 1:07 PM Sean Christopherson <seanjc@...gle.com> wrote:
>>
>> On Thu, May 12, 2022, Jon Kohler wrote:
>>>
>>>
>>>> On May 12, 2022, at 3:35 PM, Sean Christopherson <seanjc@...gle.com> wrote:
>>>>
>>>> On Thu, May 12, 2022, Sean Christopherson wrote:
>>>>> On Thu, May 12, 2022, Jon Kohler wrote:
>>>>>> Remove IBPB that is done on KVM vCPU load, as the guest-to-guest
>>>>>> attack surface is already covered by switch_mm_irqs_off() ->
>>>>>> cond_mitigation().
>>>>>>
>>>>>> The original commit 15d45071523d ("KVM/x86: Add IBPB support") was simply
>>>>>> wrong in its guest-to-guest design intention. There are three scenarios
>>>>>> at play here:
>>>>>
>>>>> Jim pointed out offline that there's a case we didn't consider. When switching between
>>>>> vCPUs in the same VM, an IBPB may be warranted as the tasks in the VM may be in
>>>>> different security domains. E.g. the guest will not get a notification that vCPU0 is
>>>>> being swapped out for vCPU1 on a single pCPU.
>>>>>
>>>>> So, sadly, after all that, I think the IBPB needs to stay. But the documentation
>>>>> most definitely needs to be updated.
>>>>>
>>>>> A per-VM capability to skip the IBPB may be warranted, e.g. for container-like
>>>>> use cases where a single VM is running a single workload.
>>>>
>>>> Ah, actually, the IBPB can be skipped if the vCPUs have different mm_structs,
>>>> because then the IBPB is fully redundant with respect to any IBPB performed by
>>>> switch_mm_irqs_off(). Hrm, though it might need a KVM or per-VM knob, e.g. just
>>>> because the VMM doesn't want IBPB doesn't mean the guest doesn't want IBPB.
>>>>
>>>> That would also sidestep the largely theoretical question of whether vCPUs from
>>>> different VMs but the same address space are in the same security domain. It doesn't
>>>> matter, because even if they are in the same domain, KVM still needs to do IBPB.
>>>
>>> So should we go back to the earlier approach where we have it be only
>>> IBPB on always_ibpb? Or what?
>>>
>>> At minimum, we need to fix the unilateral-ness of all of this :) since we’re
>>> IBPB’ing even when the user did not explicitly tell us to.
>>
>> I think we need separate controls for the guest. E.g. if the userspace VMM is
>> sufficiently hardened then it can run without the "do IBPB" flag, but that doesn't
>> mean that the entire guest it's running is sufficiently hardened.
>>
>>> That said, since I just re-read the documentation today, it does specifically
>>> suggest that if the guest wants to protect *itself* it should turn on IBPB or
>>> STIBP (or other mitigations galore), so I think we end up having to think
>>> about what our “contract” is with users who host their workloads on
>>> KVM - are they expecting us to protect them in any/all cases?
>>>
>>> Said another way, the internal guest areas of concern aren’t something
>>> the kernel would always be able to A) identify far in advance and B)
>>> always solve on the user's behalf. There is an argument to be made
>>> that the guest needs to deal with its own house, yea?
>>
>> The issue is that the guest won't get a notification if vCPU0 is replaced with
>> vCPU1 on the same physical CPU, thus the guest doesn't get an opportunity to emit
>> IBPB. Since the host doesn't know whether or not the guest wants IBPB, unless the
>> owner of the host is also the owner of the guest workload, the safe approach is to
>> assume the guest is vulnerable.
>
> Exactly. And if the guest has used taskset as its mitigation strategy,
> how is the host to know?
Yeah, that’s fair enough. I proposed a solution in my reply to Sean’s response just
as this email came in, and would love to know your thoughts (keying off the MSR bitmap).
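
For reference, the decision discussed in the quoted text above can be sketched
as a small standalone C model. This is only an illustration under stated
assumptions, not KVM code: struct mock_mm, struct mock_vcpu, the vm_skips_ibpb
field and vcpu_load_needs_ibpb() are all hypothetical names, and the
mm_switch_issues_ibpb parameter stands in for whether the host's
switch_mm_irqs_off() -> cond_mitigation() path would actually issue an IBPB
for the task being switched in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's real structures. */
struct mock_mm {
	int id;
};

struct mock_vcpu {
	const struct mock_mm *mm;	/* address space of the task backing this vCPU */
	bool vm_skips_ibpb;		/* hypothetical per-VM opt-out (single-workload VM) */
};

/*
 * Decide whether loading 'next' on a pCPU that last ran 'prev' should be
 * followed by an IBPB issued from the vCPU-load path itself.
 *
 * mm_switch_issues_ibpb: assumption that models whether the host's mm-switch
 * path already issues an IBPB when the address space changes.
 */
bool vcpu_load_needs_ibpb(const struct mock_vcpu *prev,
			  const struct mock_vcpu *next,
			  bool mm_switch_issues_ibpb)
{
	assert(next != NULL);

	/* Nothing ran here before, or the same vCPU is being reloaded. */
	if (prev == NULL || prev == next)
		return false;

	/*
	 * Different address spaces: a barrier here is redundant only if the
	 * mm-switch path really issues one. The VMM opting out of IBPB does
	 * not mean the guest is safe without it.
	 */
	if (prev->mm != next->mm)
		return !mm_switch_issues_ibpb;

	/*
	 * Same address space (e.g. vCPU0 -> vCPU1 of one VM): the guest gets
	 * no notification of the swap, so assume it is vulnerable unless the
	 * VM explicitly opted out.
	 */
	return !next->vm_skips_ibpb;
}
```

With this model, two vCPUs sharing one mm always get a barrier unless the VM
opted out, while vCPUs with different mms get one only when the mm-switch path
would not have issued it.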