Message-ID: <492a0fe6-ca0b-ca97-27bf-e6407c60469c@alibaba-inc.com>
Date:   Fri, 29 Dec 2017 21:17:34 +0800
From:   "Jia Zhang" <qianyue.zj@...baba-inc.com>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     bp@...en8.de, tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
        x86@...nel.org, linux-kernel@...r.kernel.org, tony.luck@...el.com
Subject: Re: [PATCH v4] x86/microcode/intel: Blacklist the specific BDW-EP for
 late loading



On 2017/12/29 at 8:48 PM, Ingo Molnar wrote:
> 
> * Jia Zhang <qianyue.zj@...baba-inc.com> wrote:
> 
>>
>>
>> On 2017/12/28 at 8:24 PM, Ingo Molnar wrote:
>>>
>>> * Jia Zhang <qianyue.zj@...baba-inc.com> wrote:
>>>
>>>> Instead of blacklisting all types of Broadwell processor when running
>>>> a late loading, only BDW-EP (signature 0x406f1, aka family 6, model 79,
>>>> stepping 1) with the microcode version less than 0x0b000021 needs to
>>>> be blacklisted.
>>>>
>>>> The erratum is documented in the public documentation #334165 (See
>>>> the item BDF90 for details).
>>>>
>>>> Signed-off-by: Jia Zhang <qianyue.zj@...baba-inc.com>
>>>> ---
>>>>  arch/x86/kernel/cpu/microcode/intel.c | 12 ++++++++++--
>>>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
>>>> index 8ccdca6..79cad85 100644
>>>> --- a/arch/x86/kernel/cpu/microcode/intel.c
>>>> +++ b/arch/x86/kernel/cpu/microcode/intel.c
>>>> @@ -910,8 +910,16 @@ static bool is_blacklisted(unsigned int cpu)
>>>>  {
>>>>  	struct cpuinfo_x86 *c = &cpu_data(cpu);
>>>>  
>>>> -	if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X) {
>>>> -		pr_err_once("late loading on model 79 is disabled.\n");
>>>> +	/*
>>>> +	 * The Broadwell-EP processor with the microcode version less
>>>> +	 * than 0x0b000021 may result in system hang when running a late
>>>> +	 * loading. This behavior is documented in item BDF90, #334165
>>>> +	 * (Intel Xeon Processor E7-8800/4800 v4 Product Family).
>>>> +	 */
>>>> +	if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X &&
>>>> +	    c->x86_mask == 0x01 && c->microcode < 0x0b000021) {
>>>> +		pr_err_once("late loading on cpu (sig 0x406f1) is disabled "
>>>> +			    "due to erratum causing system hang.\n");
>>>
>>> Please never break user-readable messages mid-sentence!
>>>
>>> This should be something like:
>>>
>>> 		pr_err_once("Late loading of the CPU microcode (sig 0x406f1) is disabled due to Intel erratum BDF90 causing system hangs.\n");
>>>
>>> (note the spelling and readability improvements as well)
>>>
>>> Btw., what does 'sig 0x406f1' refer to?
>>
>> It is the so-called processor signature, which uniquely identifies an
>> x86 processor model. It is the value returned in EAX by the cpuid
>> instruction for leaf 1 (eax == 1).
> 
> Ah, indeed, the (somewhat weird) encoding described in arch/x86/lib/cpu.c, which 
> is essentially family+model+stepping encoded into a single integer, right?

Totally correct.
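
For illustration, here is a minimal userspace sketch of that decoding. It
is only a sketch: the function names are hypothetical, and the logic is
meant to follow the same family/model/stepping split as the helpers in
arch/x86/lib/cpu.c rather than to reproduce them exactly.

```c
#include <stdint.h>

/* Hypothetical helper names; the bit layout is that of CPUID leaf 1 EAX. */
static unsigned int sig_family(uint32_t sig)
{
	unsigned int fam = (sig >> 8) & 0xf;

	/* The extended family field only counts when the base family is 0xf. */
	if (fam == 0xf)
		fam += (sig >> 20) & 0xff;
	return fam;
}

static unsigned int sig_model(uint32_t sig)
{
	unsigned int model = (sig >> 4) & 0xf;

	/* This sketch adds the extended model bits for family 6 and above. */
	if (sig_family(sig) >= 0x6)
		model += ((sig >> 16) & 0xf) << 4;
	return model;
}

static unsigned int sig_stepping(uint32_t sig)
{
	return sig & 0xf;
}
```

For 0x406f1 this gives family 6, model 79 (0x4f) and stepping 1, which
are the values quoted in the patch description.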

> 
> That whole area needs a good cleanup to be less confusing (we refer to the CPU 
> stepping as x86_stepping(), but the field is called ->x86_mask?), but in the 

Yes. This is a confusing name. I will send another patch to clean it up.

> meanwhile, let's please make it more obvious in user facing message what's 
> happening.
> 
> Instead of using the microcode signature of the CPU model, please write out what's 
> going on:
> 
> 	pr_err_once("Not loading old microcode version: erratum BDF90 on Intel Broadwell-EP stepping 1 CPUs may cause system hangs.\n");
> 
> ... and please also tell the user what to do about it:
> 
> 	pr_err_once("Please update your microcode files.\n");

Let me give the full background so that we can arrive at the best
description of this erratum.

If the current processor signature matches the problematic Broadwell-EP
model (0x406f1) *AND* the current microcode version is less than
0x0b000021, performing a microcode update at Linux runtime (so-called
late loading) must be prohibited in order to prevent a system hang due
to the erratum. In that case, the end user has to update the BIOS to
uprev the microcode. The microcode update loader in the BIOS can safely
issue a microcode update without being affected by this erratum. This is
the so-called early loading.
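
The condition described above can be sketched as a standalone predicate.
This is an illustration only, not the patch itself: the macro and
function names are hypothetical, and it compares the whole signature,
whereas the patch checks family, model and stepping separately.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
#define BDW_EP_SIG       0x406f1u     /* family 6, model 79, stepping 1 */
#define BDW_EP_MIN_REV   0x0b000021u  /* revisions below this are blacklisted */

/* Returns true when late loading must be refused on this CPU. */
static bool late_load_blocked(uint32_t sig, uint32_t ucode_rev)
{
	return sig == BDW_EP_SIG && ucode_rev < BDW_EP_MIN_REV;
}
```

With this check, a 0x406f1 part running revision 0x0b000020 is blocked,
while the same part at 0x0b000021 or any other signature is not.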

Thanks,
Jia

> 
> !!
> 
> Agreed?
> 
> Thanks,
> 
> 	Ingo
> 
