Message-ID: <763bb046-e016-9440-55c4-33438e35e436@intel.com>
Date: Fri, 18 Oct 2019 18:20:44 +0800
From: Xiaoyao Li <xiaoyao.li@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <sean.j.christopherson@...el.com>,
Fenghua Yu <fenghua.yu@...el.com>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
H Peter Anvin <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...el.com>,
Radim Krcmar <rkrcmar@...hat.com>,
Ashok Raj <ashok.raj@...el.com>,
Tony Luck <tony.luck@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
Sai Praneeth Prakhya <sai.praneeth.prakhya@...el.com>,
Ravi V Shankar <ravi.v.shankar@...el.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
x86 <x86@...nel.org>, kvm@...r.kernel.org
Subject: Re: [RFD] x86/split_lock: Request to Intel
On 10/18/2019 5:02 PM, Thomas Gleixner wrote:
> On Fri, 18 Oct 2019, Xiaoyao Li wrote:
>> On 10/17/2019 8:29 PM, Thomas Gleixner wrote:
>>> The more I look at this trainwreck, the less interested I am in merging any
>>> of this at all.
>>>
>>> The fact that it took Intel more than a year to figure out that the MSR is
>>> per core and not per thread is yet another proof that this industry just
>>> works by pure chance.
>>>
>>
>> Whether it's per-core or per-thread doesn't affect much how we implement for
>> host/native.
>
> How useful.
OK. IIUC, we can agree on the following usage model for native:
We enable #AC on all cores/threads to detect split locks.
- If user space causes #AC, we send SIGBUS to it.
- If the kernel causes #AC, we WARN and globally disable #AC on all
cores/threads so that the kernel can keep working. (Disabling #AC only on
the thread that generated it doesn't help, since the buggy kernel code can
run on any thread; hence we disable #AC on all of them.)
As described above, #AC is either enabled globally or disabled globally, so
whether it's per-core or per-thread really doesn't matter.
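To make the model above concrete, here is a minimal user-space sketch in C. It is only an illustration of the policy, not kernel code: NR_THREADS, the ac_enabled[] array, the enum and all function names are hypothetical, and the real implementation would write the control MSR and raise a signal instead of returning a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_THREADS 8 /* hypothetical number of hardware threads */

/* Hypothetical result of handling one split-lock #AC. */
enum ac_action {
	AC_SEND_SIGBUS,          /* user space caused it: signal the task */
	AC_WARN_AND_DISABLE_ALL, /* kernel caused it: WARN, turn detection off */
};

/* Toy model of the detection state as seen by each hardware thread. */
static bool ac_enabled[NR_THREADS];

/* Boot-time default in this model: detection is on everywhere. */
static void split_lock_init(void)
{
	for (size_t i = 0; i < NR_THREADS; i++)
		ac_enabled[i] = true;
}

static bool split_lock_enabled_everywhere(void)
{
	for (size_t i = 0; i < NR_THREADS; i++)
		if (!ac_enabled[i])
			return false;
	return true;
}

static bool split_lock_disabled_everywhere(void)
{
	for (size_t i = 0; i < NR_THREADS; i++)
		if (ac_enabled[i])
			return false;
	return true;
}

/* Policy for a split-lock #AC raised on hardware thread 'thread'. */
static enum ac_action handle_split_lock_ac(size_t thread, bool from_user)
{
	assert(thread < NR_THREADS);

	if (from_user)
		return AC_SEND_SIGBUS;

	/*
	 * Disabling only the faulting thread would not help, because the
	 * buggy kernel code can run on any thread, so disable all of them.
	 */
	for (size_t i = 0; i < NR_THREADS; i++)
		ac_enabled[i] = false;
	return AC_WARN_AND_DISABLE_ALL;
}
```

In this sketch the state only ever moves from "enabled everywhere" to "disabled everywhere", which is why the scope of the control bit does not change the native logic.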
>> And also, no matter it's per-core or per-thread, we always can do something in
>> VIRT.
>
> It matters a lot. If it would be per thread then we would not have this
> discussion at all.
Indeed, it's the fact that the control MSR bit is per-core that causes this
discussion. But the per-core scope only makes this feature difficult or
impossible to virtualize.
We could decide not to expose it to the guest to avoid the really bad
thing. However, even if we don't expose this feature to the guest and
don't virtualize it, the problem below is still there.
If you think it's not a problem and that it's acceptable to add an option
to let KVM disable the host's #AC detection, we can just do it this way.
Then we can design the virtualization part without any change to the
native design at all.
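To illustrate why the per-core scope is what hurts virtualization, here is a toy model in C. It assumes two SMT siblings per core; the array, the constants and the function names are hypothetical and do not reflect the real MSR interface.

```c
#include <assert.h>
#include <stdbool.h>

#define THREADS_PER_CORE 2 /* assumption: two SMT siblings per core */
#define NR_CORES 2         /* hypothetical core count */

/* One enable bit per core, i.e. the per-core scope discussed above. */
static bool core_ac_bit[NR_CORES] = { true, true };

/* Map a hardware thread to the core whose bit it shares. */
static int thread_to_core(int thread)
{
	assert(thread >= 0 && thread < NR_CORES * THREADS_PER_CORE);
	return thread / THREADS_PER_CORE;
}

/* A write through any thread changes the bit for the whole core. */
static void write_ac_bit(int thread, bool on)
{
	core_ac_bit[thread_to_core(thread)] = on;
}

/* A read through any thread returns the bit of its core. */
static bool read_ac_bit(int thread)
{
	return core_ac_bit[thread_to_core(thread)];
}
```

For example, after write_ac_bit(0, false) the sibling thread 1 also reads the bit as clear, while threads 2 and 3 on the other core are unaffected. Toggling the bit on behalf of a guest vCPU on one thread therefore also changes detection for whatever runs on the sibling.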
>> Maybe what matters is below.
>>
>>> Seriously, this makes only sense when it's by default enabled and not
>>> rendered useless by VIRT. Otherwise we never get any reports and none of
>>> the issues are going to be fixed.
>>>
>>
>> For VIRT, it doesn't want old guest to be killed due to #AC. But for native,
>> it doesn't want VIRT to disable the #AC detection
>>
>> I think it's just about the default behavior that whether to disable the
>> host's #AC detection or kill the guest (SIGBUS or something else) once there
>> is an split-lock #AC in guest.
>>
>> So we can provide CONFIG option to set the default behavior and module
>> parameter to let KVM set/change the default behavior.
>
> Care to read through the whole discussion and figure out WHY it's not that
> simple?
>
> Thanks,
>
> tglx
>