[an error occurred while processing this directive]
lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
[an error occurred while processing this directive]
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190925180931.GG31852@linux.intel.com>
Date:   Wed, 25 Sep 2019 11:09:31 -0700
From:   Sean Christopherson <sean.j.christopherson@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Fenghua Yu <fenghua.yu@...el.com>, Ingo Molnar <mingo@...hat.com>,
        Borislav Petkov <bp@...en8.de>, H Peter Anvin <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krcmar <rkrcmar@...hat.com>,
        Ashok Raj <ashok.raj@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Xiaoyao Li <xiaoyao.li@...el.com>,
        Sai Praneeth Prakhya <sai.praneeth.prakhya@...el.com>,
        Ravi V Shankar <ravi.v.shankar@...el.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        x86 <x86@...nel.org>, kvm@...r.kernel.org
Subject: Re: [PATCH v9 09/17] x86/split_lock: Handle #AC exception for split
 lock

On Wed, Jun 26, 2019 at 11:47:40PM +0200, Thomas Gleixner wrote:
> So only one of the CPUs will win the cmpxchg race, set te variable to 1 and
> warn, the other and any subsequent AC on any other CPU will not warn
> either. So you don't need WARN_ONCE() at all. It's redundant and confusing
> along with the atomic_set().
> 
> Whithout reading that link [1], what Ingo proposed was surely not the
> trainwreck which you decided to put into that debugfs thing.

We're trying to sort out the trainwreck, but there's an additional wrinkle
that I'd like your input on.

We overlooked the fact that MSR_TEST_CTRL is per-core, i.e. shared by
sibling hyperthreads.  This is especially problematic for KVM, as loading
MSR_TEST_CTRL during VM-Enter could cause spurious #AC faults in the kernel
and bounce MSR_TEST_CTRL.split_lock.

E.g. if CPU0 and CPU1 are siblings and CPU1 is running a KVM guest with
MSR_TEST_CTRL.split_lock=1, hitting an #AC on CPU0 in the host kernel will
lead to suprious #AC faults and constant toggling of of the MSR.

  CPU0               CPU1

         split_lock=enabled

  #AC -> disabled

                     VM-Enter -> enabled

  #AC -> disabled

                     VM-Enter -> enabled

  #AC -> disabled



My thought to handle this:

  - Remove the per-cpu cache.

  - Rework the atomic variable to differentiate between "disabled globally"
    and "disabled by kernel (on some CPUs)".

  - Modify the #AC handler to test/set the same atomic variable as the
    sysfs knob.  This is the "disabled by kernel" flow.

  - Modify the debugfs/sysfs knob to only allow disabling split-lock
    detection.  This is the "disabled globally" path, i.e. sends IPIs to
    clear MSR_TEST_CTRL.split_lock on all online CPUs.

  - Modify the resume/init flow to clear MSR_TEST_CTRL.split_lock if it's
    been disabled on *any* CPU via #AC or via the knob.

  - Modify the debugs/sysfs read function to either print the raw atomic
    variable, or differentiate between "enabled", "disabled globally" and
   "disabled by kernel".

  - Remove KVM loading of MSR_TEST_CTRL, i.e. KVM *never* writes the CPU's
    actual MSR_TEST_CTRL.  KVM still emulates MSR_TEST_CTRL so that the
    guest can do WRMSR and handle its own #AC faults, but KVM doesn't
    change the value in hardware.

      * Allowing guest to enable split-lock detection can induce #AC on
        the host after it has been explicitly turned off, e.g. the sibling
        hyperthread hits an #AC in the host kernel, or worse, causes a
        different process in the host to SIGBUS.

      * Allowing guest to disable split-lock detection opens up the host
        to DoS attacks.

  - KVM advertises split-lock detection to guest/userspace if and only if
    split_lock_detect_disabled is zero.

  - Add a pr_warn_once() in KVM that triggers if split locks are disabled
    after support has been advertised to a guest.

Does this sound sane?

The question at the forefront of my mind is: why not have the #AC handler
send a fire-and-forget IPI to online CPUs to disable split-lock detection
on all CPUs?  Would the IPI be problematic?  Globally disabling split-lock
on any #AC would (marginally) simplify the code and would eliminate the
oddity of userspace process (and KVM guest) #AC behavior varying based on
the physical CPU it's running on.


Something like:

#define SPLIT_LOCK_DISABLED_IN_KERNEL	BIT(0)
#define SPLIT_LOCK_DISABLED_GLOBALLY	BIT(1)

static atomic_t split_lock_detect_disabled = ATOMIT_INIT(0);

void split_lock_detect_ac(void)
{
	lockdep_assert_irqs_disabled();

	/* Disable split lock detection on this CPU to avoid reentrant #AC. */
	wrmsrl(MSR_TEST_CTRL,
	       rdmsrl(MSR_TEST_CTRL) & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT);

	/*
	 * If split-lock detection has not been disabled, either by the kernel
	 * or globally, record that it has been disabled by the kernel and
	 * WARN.  Guarding WARN with the atomic ensures only the first #AC due
	 * to split-lock is logged, e.g. if multiple CPUs encounter #AC or if
	 * #AC is retriggered by a perf context NMI that interrupts the
	 * original WARN.
	 */
	if (atomic_cmpxchg(&split_lock_detect_disabled, 0,
			   SPLIT_LOCK_DISABLED_IN_KERNEL) == 0)
	        WARN(1, "split lock operation detected\n");
}

static ssize_t split_lock_detect_wr(struct file *f, const char __user *user_buf,
				    size_t count, loff_t *ppos)
{
	int old;

	<parse or ignore input value?>
	
	old = atomic_fetch_or(SPLIT_LOCK_DISABLED_GLOBALLY,
			      &split_lock_detect_disabled);

	/* Update MSR_TEST_CTRL unless split-lock was already disabled. */
	if (!(old & SPLIT_LOCK_DISABLED_GLOBALLY))
		on_each_cpu(split_lock_update, NULL, 1);

	return count;
}