lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.21.1910172207010.1869@nanos.tec.linutronix.de>
Date:   Thu, 17 Oct 2019 23:31:15 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Sean Christopherson <sean.j.christopherson@...el.com>
cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Xiaoyao Li <xiaoyao.li@...el.com>,
        Fenghua Yu <fenghua.yu@...el.com>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        H Peter Anvin <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Radim Krcmar <rkrcmar@...hat.com>,
        Ashok Raj <ashok.raj@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Sai Praneeth Prakhya <sai.praneeth.prakhya@...el.com>,
        Ravi V Shankar <ravi.v.shankar@...el.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        x86 <x86@...nel.org>, kvm@...r.kernel.org
Subject: Re: [RFD] x86/split_lock: Request to Intel

On Thu, 17 Oct 2019, Sean Christopherson wrote:
> On Thu, Oct 17, 2019 at 02:29:45PM +0200, Thomas Gleixner wrote:
> > The more I look at this trainwreck, the less interested I am in merging any
> > of this at all.
> > 
> > The fact that it took Intel more than a year to figure out that the MSR is
> > per core and not per thread is yet another proof that this industry just
> > works by pure chance.
> > 
> > There is a simple way out of this misery:
> > 
> >   Intel issues a microcode update which does:
> > 
> >     1) Convert the OR logic of the AC enable bit in the TEST_CTRL MSR to
> >        AND logic, i.e. when one thread disables AC it's automatically
> >        disabled on the core.
> > 
> >        Alternatively it supresses the #AC when the current thread has it
> >        disabled.
> > 
> >     2) Provide a separate bit which indicates that the AC enable logic is
> >        actually AND based or that #AC is supressed when the current thread
> >        has it disabled.
> > 
> >     Which way I don't really care as long as it makes sense.
> 
> The #AC bit doesn't use OR-logic, it's straight up shared, i.e. writes on
> one CPU are immediately visible on its sibling CPU.

That's less horrible than I read out of your initial explanation.

Thankfully all of this is meticulously documented in the SDM ...

Though it changes the picture radically. The truly shared MSR allows
regular software synchronization without IPIs and without an insane amount
of corner case handling.

So as you pointed out we need a per core state, which is influenced by:

 1) The global enablement switch

 2) Host induced #AC

 3) Guest induced #AC

    A) Guest has #AC handling

    B) Guest has no #AC handling

#1:

   - OFF: #AC is globally disabled

   - ON:  #AC is globally enabled

   - FORCE: same as ON but #AC is enforced on guests

#2:

   If the host triggers an #AC then the #AC has to be force disabled on the
   affected core independent of the state of #1. Nothing we can do about
   that and once the initial wave of #AC issues is fixed this should not
   happen on production systems. That disables #3 even for the #3.A case
   for simplicity sake.

#3:

   A) Guest has #AC handling
    
      #AC is forwarded to the guest. No further action required aside of
      accounting

   B) Guest has no #AC handling

      If #AC triggers the resulting action depends on the state of #1:

      	 - FORCE: Guest is killed with SIGBUS or whatever the virt crowd
	   	  thinks is the appropriate solution

         - ON: #AC triggered state is recorded per vCPU and the MSR is
	   	toggled on VMENTER/VMEXIT in software from that point on.

So the only interesting case is #3.B and #1.state == ON. There you need
serialization of the state and the MSR write between the cores, but only
when the vCPU triggered an #AC. Until then, nothing to do.

vmenter()
{
	if (vcpu->ac_disable)
		this_core_disable_ac();
}

vmexit()
{
	if (vcpu->ac_disable) {
		this_core_enable_ac();
}

this_core_dis/enable_ac() takes the global state into account and has the
necessary serialization in place.

Thanks,

	tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ