linux-kernel - Re: [PATCH v9 09/17] x86/split_lock: Handle #AC exception for split lock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bc8f5850-ba7d-45b7-a30b-5560764edcc8@intel.com>
Date:   Wed, 16 Oct 2019 14:58:05 +0800
From:   Xiaoyao Li <xiaoyao.li@...el.com>
To:     Sean Christopherson <sean.j.christopherson@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>
Cc:     Fenghua Yu <fenghua.yu@...el.com>, Ingo Molnar <mingo@...hat.com>,
        Borislav Petkov <bp@...en8.de>, H Peter Anvin <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krcmar <rkrcmar@...hat.com>,
        Ashok Raj <ashok.raj@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Sai Praneeth Prakhya <sai.praneeth.prakhya@...el.com>,
        Ravi V Shankar <ravi.v.shankar@...el.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        x86 <x86@...nel.org>, kvm@...r.kernel.org
Subject: Re: [PATCH v9 09/17] x86/split_lock: Handle #AC exception for split
 lock

On 9/26/2019 2:09 AM, Sean Christopherson wrote:
> On Wed, Jun 26, 2019 at 11:47:40PM +0200, Thomas Gleixner wrote:
>> So only one of the CPUs will win the cmpxchg race, set te variable to 1 and
>> warn, the other and any subsequent AC on any other CPU will not warn
>> either. So you don't need WARN_ONCE() at all. It's redundant and confusing
>> along with the atomic_set().
>>
>> Whithout reading that link [1], what Ingo proposed was surely not the
>> trainwreck which you decided to put into that debugfs thing.
> 
> We're trying to sort out the trainwreck, but there's an additional wrinkle
> that I'd like your input on.
> 
> We overlooked the fact that MSR_TEST_CTRL is per-core, i.e. shared by
> sibling hyperthreads.  This is especially problematic for KVM, as loading
> MSR_TEST_CTRL during VM-Enter could cause spurious #AC faults in the kernel
> and bounce MSR_TEST_CTRL.split_lock.
> 
> E.g. if CPU0 and CPU1 are siblings and CPU1 is running a KVM guest with
> MSR_TEST_CTRL.split_lock=1, hitting an #AC on CPU0 in the host kernel will
> lead to suprious #AC faults and constant toggling of of the MSR.
> 
>    CPU0               CPU1
> 
>           split_lock=enabled
> 
>    #AC -> disabled
> 
>                       VM-Enter -> enabled
> 
>    #AC -> disabled
> 
>                       VM-Enter -> enabled
> 
>    #AC -> disabled
> 
> 
> 
> My thought to handle this:
> 
>    - Remove the per-cpu cache.
> 
>    - Rework the atomic variable to differentiate between "disabled globally"
>      and "disabled by kernel (on some CPUs)".
> 
>    - Modify the #AC handler to test/set the same atomic variable as the
>      sysfs knob.  This is the "disabled by kernel" flow.
> 
>    - Modify the debugfs/sysfs knob to only allow disabling split-lock
>      detection.  This is the "disabled globally" path, i.e. sends IPIs to
>      clear MSR_TEST_CTRL.split_lock on all online CPUs.
> 
>    - Modify the resume/init flow to clear MSR_TEST_CTRL.split_lock if it's
>      been disabled on *any* CPU via #AC or via the knob.
> 
>    - Modify the debugs/sysfs read function to either print the raw atomic
>      variable, or differentiate between "enabled", "disabled globally" and
>     "disabled by kernel".
> 
>    - Remove KVM loading of MSR_TEST_CTRL, i.e. KVM *never* writes the CPU's
>      actual MSR_TEST_CTRL.  KVM still emulates MSR_TEST_CTRL so that the
>      guest can do WRMSR and handle its own #AC faults, but KVM doesn't
>      change the value in hardware.
> 
>        * Allowing guest to enable split-lock detection can induce #AC on
>          the host after it has been explicitly turned off, e.g. the sibling
>          hyperthread hits an #AC in the host kernel, or worse, causes a
>          different process in the host to SIGBUS.
> 
>        * Allowing guest to disable split-lock detection opens up the host
>          to DoS attacks.
> 
>    - KVM advertises split-lock detection to guest/userspace if and only if
>      split_lock_detect_disabled is zero.
> 
>    - Add a pr_warn_once() in KVM that triggers if split locks are disabled
>      after support has been advertised to a guest.
> 
> Does this sound sane?
> 
> The question at the forefront of my mind is: why not have the #AC handler
> send a fire-and-forget IPI to online CPUs to disable split-lock detection
> on all CPUs?  Would the IPI be problematic?  Globally disabling split-lock
> on any #AC would (marginally) simplify the code and would eliminate the
> oddity of userspace process (and KVM guest) #AC behavior varying based on
> the physical CPU it's running on.
> 
> 
> Something like:
> 
> #define SPLIT_LOCK_DISABLED_IN_KERNEL	BIT(0)
> #define SPLIT_LOCK_DISABLED_GLOBALLY	BIT(1)
> 
> static atomic_t split_lock_detect_disabled = ATOMIT_INIT(0);
> 
> void split_lock_detect_ac(void)
> {
> 	lockdep_assert_irqs_disabled();
> 
> 	/* Disable split lock detection on this CPU to avoid reentrant #AC. */
> 	wrmsrl(MSR_TEST_CTRL,
> 	       rdmsrl(MSR_TEST_CTRL) & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT);
> 
> 	/*
> 	 * If split-lock detection has not been disabled, either by the kernel
> 	 * or globally, record that it has been disabled by the kernel and
> 	 * WARN.  Guarding WARN with the atomic ensures only the first #AC due
> 	 * to split-lock is logged, e.g. if multiple CPUs encounter #AC or if
> 	 * #AC is retriggered by a perf context NMI that interrupts the
> 	 * original WARN.
> 	 */
> 	if (atomic_cmpxchg(&split_lock_detect_disabled, 0,
> 			   SPLIT_LOCK_DISABLED_IN_KERNEL) == 0)
> 	        WARN(1, "split lock operation detected\n");
> }
> 
> static ssize_t split_lock_detect_wr(struct file *f, const char __user *user_buf,
> 				    size_t count, loff_t *ppos)
> {
> 	int old;
> 
> 	<parse or ignore input value?>
> 	
> 	old = atomic_fetch_or(SPLIT_LOCK_DISABLED_GLOBALLY,
> 			      &split_lock_detect_disabled);
> 
> 	/* Update MSR_TEST_CTRL unless split-lock was already disabled. */
> 	if (!(old & SPLIT_LOCK_DISABLED_GLOBALLY))
> 		on_each_cpu(split_lock_update, NULL, 1);
> 
> 	return count;
> }
> 

Hi Thomas,

Could you please have a look at Sean's proposal and give your opinion.