[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bc8f5850-ba7d-45b7-a30b-5560764edcc8@intel.com>
Date: Wed, 16 Oct 2019 14:58:05 +0800
From: Xiaoyao Li <xiaoyao.li@...el.com>
To: Sean Christopherson <sean.j.christopherson@...el.com>,
Thomas Gleixner <tglx@...utronix.de>
Cc: Fenghua Yu <fenghua.yu@...el.com>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, H Peter Anvin <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...el.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Radim Krcmar <rkrcmar@...hat.com>,
Ashok Raj <ashok.raj@...el.com>,
Tony Luck <tony.luck@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
Sai Praneeth Prakhya <sai.praneeth.prakhya@...el.com>,
Ravi V Shankar <ravi.v.shankar@...el.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
x86 <x86@...nel.org>, kvm@...r.kernel.org
Subject: Re: [PATCH v9 09/17] x86/split_lock: Handle #AC exception for split
lock
On 9/26/2019 2:09 AM, Sean Christopherson wrote:
> On Wed, Jun 26, 2019 at 11:47:40PM +0200, Thomas Gleixner wrote:
>> So only one of the CPUs will win the cmpxchg race, set te variable to 1 and
>> warn, the other and any subsequent AC on any other CPU will not warn
>> either. So you don't need WARN_ONCE() at all. It's redundant and confusing
>> along with the atomic_set().
>>
>> Whithout reading that link [1], what Ingo proposed was surely not the
>> trainwreck which you decided to put into that debugfs thing.
>
> We're trying to sort out the trainwreck, but there's an additional wrinkle
> that I'd like your input on.
>
> We overlooked the fact that MSR_TEST_CTRL is per-core, i.e. shared by
> sibling hyperthreads. This is especially problematic for KVM, as loading
> MSR_TEST_CTRL during VM-Enter could cause spurious #AC faults in the kernel
> and bounce MSR_TEST_CTRL.split_lock.
>
> E.g. if CPU0 and CPU1 are siblings and CPU1 is running a KVM guest with
> MSR_TEST_CTRL.split_lock=1, hitting an #AC on CPU0 in the host kernel will
> lead to suprious #AC faults and constant toggling of of the MSR.
>
> CPU0 CPU1
>
> split_lock=enabled
>
> #AC -> disabled
>
> VM-Enter -> enabled
>
> #AC -> disabled
>
> VM-Enter -> enabled
>
> #AC -> disabled
>
>
>
> My thought to handle this:
>
> - Remove the per-cpu cache.
>
> - Rework the atomic variable to differentiate between "disabled globally"
> and "disabled by kernel (on some CPUs)".
>
> - Modify the #AC handler to test/set the same atomic variable as the
> sysfs knob. This is the "disabled by kernel" flow.
>
> - Modify the debugfs/sysfs knob to only allow disabling split-lock
> detection. This is the "disabled globally" path, i.e. sends IPIs to
> clear MSR_TEST_CTRL.split_lock on all online CPUs.
>
> - Modify the resume/init flow to clear MSR_TEST_CTRL.split_lock if it's
> been disabled on *any* CPU via #AC or via the knob.
>
> - Modify the debugs/sysfs read function to either print the raw atomic
> variable, or differentiate between "enabled", "disabled globally" and
> "disabled by kernel".
>
> - Remove KVM loading of MSR_TEST_CTRL, i.e. KVM *never* writes the CPU's
> actual MSR_TEST_CTRL. KVM still emulates MSR_TEST_CTRL so that the
> guest can do WRMSR and handle its own #AC faults, but KVM doesn't
> change the value in hardware.
>
> * Allowing guest to enable split-lock detection can induce #AC on
> the host after it has been explicitly turned off, e.g. the sibling
> hyperthread hits an #AC in the host kernel, or worse, causes a
> different process in the host to SIGBUS.
>
> * Allowing guest to disable split-lock detection opens up the host
> to DoS attacks.
>
> - KVM advertises split-lock detection to guest/userspace if and only if
> split_lock_detect_disabled is zero.
>
> - Add a pr_warn_once() in KVM that triggers if split locks are disabled
> after support has been advertised to a guest.
>
> Does this sound sane?
>
> The question at the forefront of my mind is: why not have the #AC handler
> send a fire-and-forget IPI to online CPUs to disable split-lock detection
> on all CPUs? Would the IPI be problematic? Globally disabling split-lock
> on any #AC would (marginally) simplify the code and would eliminate the
> oddity of userspace process (and KVM guest) #AC behavior varying based on
> the physical CPU it's running on.
>
>
> Something like:
>
> #define SPLIT_LOCK_DISABLED_IN_KERNEL BIT(0)
> #define SPLIT_LOCK_DISABLED_GLOBALLY BIT(1)
>
> static atomic_t split_lock_detect_disabled = ATOMIT_INIT(0);
>
> void split_lock_detect_ac(void)
> {
> lockdep_assert_irqs_disabled();
>
> /* Disable split lock detection on this CPU to avoid reentrant #AC. */
> wrmsrl(MSR_TEST_CTRL,
> rdmsrl(MSR_TEST_CTRL) & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT);
>
> /*
> * If split-lock detection has not been disabled, either by the kernel
> * or globally, record that it has been disabled by the kernel and
> * WARN. Guarding WARN with the atomic ensures only the first #AC due
> * to split-lock is logged, e.g. if multiple CPUs encounter #AC or if
> * #AC is retriggered by a perf context NMI that interrupts the
> * original WARN.
> */
> if (atomic_cmpxchg(&split_lock_detect_disabled, 0,
> SPLIT_LOCK_DISABLED_IN_KERNEL) == 0)
> WARN(1, "split lock operation detected\n");
> }
>
> static ssize_t split_lock_detect_wr(struct file *f, const char __user *user_buf,
> size_t count, loff_t *ppos)
> {
> int old;
>
> <parse or ignore input value?>
>
> old = atomic_fetch_or(SPLIT_LOCK_DISABLED_GLOBALLY,
> &split_lock_detect_disabled);
>
> /* Update MSR_TEST_CTRL unless split-lock was already disabled. */
> if (!(old & SPLIT_LOCK_DISABLED_GLOBALLY))
> on_each_cpu(split_lock_update, NULL, 1);
>
> return count;
> }
>
Hi Thomas,
Could you please have a look at Sean's proposal and give your opinion.
Powered by blists - more mailing lists