linux-kernel - Re: [PATCH 2/2] KVM: VMX: Extend VMX's #AC handding

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200204060353.GB31665@linux.intel.com>
Date:   Mon, 3 Feb 2020 22:03:53 -0800
From:   Sean Christopherson <sean.j.christopherson@...el.com>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     Xiaoyao Li <xiaoyao.li@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Paolo Bonzini <pbonzini@...hat.com>, X86 ML <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        kvm list <kvm@...r.kernel.org>
Subject: Re: [PATCH 2/2] KVM: VMX: Extend VMX's #AC handding

On Mon, Feb 03, 2020 at 10:49:50AM -0800, Andy Lutomirski wrote:
> On Sat, Feb 1, 2020 at 8:33 PM Xiaoyao Li <xiaoyao.li@...el.com> wrote:
> >
> > On 2/2/2020 1:56 AM, Andy Lutomirski wrote:
> > >
> > >
> > > There are two independent problems here.  First, SLD *can’t* be
> > > virtualized sanely because it’s per-core not per-thread.
> >
> > Sadly, it's the fact we cannot change. So it's better virtualized only when
> > SMT is disabled to make thing simple.
> >
> > > Second, most users *won’t want* to virtualize it correctly even if they
> > > could: if a guest is allowed to do split locks, it can DoS the system.
> >
> > To avoid DoS attack, it must use sld_fatal mode. In this case, guest are
> > forbidden to do split locks.
> >
> > > So I think there should be an architectural way to tell a guest that SLD
> > > is on whether it likes it or not. And the guest, if booted with sld=warn,
> > > can print a message saying “haha, actually SLD is fatal” and carry on.

Ya, but to make it architectural it needs to be actual hardware behavior.
I highly doubt we can get explicit documentation in the SDM regarding the
behavior of a hypervisor.  E.g. the "official" hypervisor CPUID bit and
CPUID range is documented in the SDM as simply being reserved until the
end of time.  Getting a bit reserved in the SDM does us no good as VMM
handling of the bit would still be determined by convention.

But, getting something in the SDM would serve our purposes, even if it's
too late to get it implemented for the first CPUs that support SLD.  It
would, in theory, require kernels to be prepared to handle a sticky SLD
bit and define a common way for VMMs to virtualize the behavior.

A sticky/lock bit in the MSR is probably the easiest to implement in
ucode?  That'd be easy for KVM to emulate, and then the kernel init code
would look something like:

static void split_lock_init(void)
{
        u64 test_ctrl_val;

        if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val)) {
		sld_state = sld_off;
                return;
	}

        if (sld_state != sld_fatal &&
            (test_ctrl_val & MSR_TEST_CTRL_LOCK_DETECT) &&
            (test_ctrl_val & MSR_TEST_CTRL_LOCK_DETECT_STICKY)) {
		pr_crit("haha, actually SLD is fatal\n");
                sld_state = std_fatal;
                return;
        }

	...
}

> > OK. Let me sort it out.
> >
> > If SMT is disabled/unsupported, so KVM advertises SLD feature to guest.
> > below are all the case:
> >
> > -----------------------------------------------------------------------
> > Host    Guest   Guest behavior
> > -----------------------------------------------------------------------
> > 1. off          same as in bare metal
> > -----------------------------------------------------------------------
> > 2. warn off     allow guest do split lock (for old guest):
> >                 hardware bit set initially, once split lock
> >                 happens, clear hardware bit when vcpu is running
> >                 So, it's the same as in bare metal
> >
> > 3.      warn    1. user space: get #AC, then clear MSR bit, but
> >                    hardware bit is not cleared, #AC again, finally
> >                    clear hardware bit when vcpu is running.
> >                    So it's somehow the same as in bare-metal
> 
> Well, kind of, except that the warning is inaccurate -- there is no
> guarantee that the hardware bit will be set at all when the guest is
> running.  This doesn't sound *that* bad, but it does mean that the
> guest doesn't get the degree of DoS protection it thinks it's getting.
> 
> My inclination is that, the host mode is warn, then SLD should not be
> exposed to the guest at all and the host should fully handle it.

KVM can expose it to the guest.  KVM just needs to ensure SLD is turned
on prior to VM-Enter with vcpu->msr_test_ctrl.sld=1, which is easy enough.

> >                 2. kernel: same as in bare metal.
> >
> > 4.      fatal   same as in bare metal
> > ----------------------------------------------------------------------
> > 5.fatal off     guest is killed when split lock,
> >                 or forward #AC to guest, this way guest gets an
> >                 unexpected #AC
> 
> Killing the guest seems like the right choice.  But see below -- this
> is not ideal if the guest is new.
> 
> >
> > 6.      warn    1. user space: get #AC, then clear MSR bit, but
> >                    hardware bit is not cleared, #AC again,
> >                    finally guest is killed, or KVM forwards #AC
> >                    to guest then guest gets an unexpected #AC.
> >                 2. kernel: same as in bare metal, call die();
> >
> > 7.      fatal   same as in bare metal
> > ----------------------------------------------------------------------
> >
> > Based on the table above, if we want guest has same behavior as in bare
> > metal, we can set host to sld_warn mode.
> 
> I don't think this is correct.  If the host is in warn mode, then the
> guest behavior will be erratic.  I'm not sure it makes sense for KVM
> to expose such erratic behavior to the guest.

It's doable without introducing non-architectural behavior and without too
much pain on KVM's end.

https://lkml.kernel.org/r/20200204053552.GA31665@linux.intel.com

> > If we want prevent DoS from guest, we should set host to sld_fatal mode.
> >
> >
> > Now, let's analysis what if there is an architectural way to tell a
> > guest that SLD is forced on. Assume it's a SLD_forced_on cpuid bit.
> >
> > - Host is sld_off, SLD_forced_on cpuid bit is not set, no change for
> >    case #1