Message-ID: <2714c370-d3ba-4522-a7ec-be30186181f0@molgen.mpg.de>
Date: Fri, 27 Jan 2017 14:35:16 +0100
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Ashok Raj <ashok.raj@...el.com>
Cc: Borislav Petkov <bp@...en8.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Thorsten Leemhuis <linux@...mhuis.info>,
Len Brown <len.brown@...el.com>,
Tony Luck <tony.luck@...el.com>,
Mario Limonciello <mario.limonciello@...l.com>,
Thorsten Leemhuis <linux@...mhuis.info>
Subject: Re: Dell XPS13: MCE (Hardware Error) reported
Dear Ashok,
On 01/09/17 20:23, Raj, Ashok wrote:
> On Mon, Jan 09, 2017 at 12:53:33PM +0100, Paul Menzel wrote:
>
>> On 01/05/17 02:12, Raj, Ashok wrote:
>>
>>>>> CPUID Vendor Intel Family 6 Model 142
>>> This is Kabylake Mobile
>>>
>>>>> Hardware event. This is not a software error.
>>>>> MCE 1
>>>>> CPU 0 BANK 7
>>>>> MISC 7880018086 ADDR fef1ce40
>>>>> TIME 1483543069 Wed Jan 4 16:17:49 2017
>
>>>>> STATUS ee0000000040110a MCGSTATUS 0
>>>
>>> Decoding the bits further from MCi_STATUS above:
>>> Val=1, OVER=1, UC=1, but EN=0 indicates this isn't a MCE, hence should have
>>> been signaled by a CMCI.
>>>
>>> PCC=1, but should be ignored when EN=0.
>>> MCACOD: 110a MSCOD: 0040
>
> This MSCOD indicates that it's a write-back access to MMIO space. It's possible
> that the BIOS is scanning a certain memory region during boot, during which time
> the BIOS disables generation of MCEs, which is why EN=0 in the above log.
>
> It's a BIOS bug; one would expect the BIOS to clear these before handoff to the
> OS. During OS boot we also scan all MC banks and log/clear them.
>
> If you aren't observing them during normal operation you can safely ignore
> these preboot logs, or pass them along to your OEM.
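For anyone who wants to reproduce the bit decoding quoted above, here is a minimal Python sketch. It assumes the usual IA32_MCi_STATUS layout (VAL in bit 63, OVER in 62, UC in 61, EN in 60, MISCV in 59, ADDRV in 58, PCC in 57, MSCOD in bits 31:16, MCACOD in bits 15:0). The function name `decode_mci_status` is made up for this illustration and is not part of mcelog or the kernel.

```python
def decode_mci_status(status: int) -> dict:
    """Split a 64-bit MCi_STATUS value into its main fields.

    Bit positions are an assumption based on the standard
    IA32_MCi_STATUS layout; check them against your CPU's documentation.
    """
    def bit(n: int) -> int:
        # Extract the single bit at position n.
        return (status >> n) & 1

    return {
        "VAL": bit(63),     # register contents are valid
        "OVER": bit(62),    # an earlier error was overwritten
        "UC": bit(61),      # uncorrected error
        "EN": bit(60),      # error reporting was enabled
        "MISCV": bit(59),   # MISC register is valid
        "ADDRV": bit(58),   # ADDR register is valid
        "PCC": bit(57),     # processor context corrupt
        "MSCOD": (status >> 16) & 0xFFFF,   # model-specific error code
        "MCACOD": status & 0xFFFF,          # architectural error code
    }


# STATUS value taken from the mcelog record quoted above.
fields = decode_mci_status(0xEE0000000040110A)
for name, value in fields.items():
    if name in ("MSCOD", "MCACOD"):
        print(f"{name}={value:04x}")
    else:
        print(f"{name}={value}")
```

On the STATUS value from the log this gives VAL=1, OVER=1, UC=1, EN=0, PCC=1, MSCOD 0040 and MCACOD 110a, which matches the decoding in the quoted message.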
Thank you very much for your help. After wasting my time with Dell
support over Twitter [1], where they basically make you jump through
hoops and then claim it’s an mcelog issue (apparently they only run
`sudo mcelog`), I updated to the latest firmware, 1.3.2, released
yesterday [2].
With that new firmware version, it looks like the firmware has been
fixed, and Linux no longer reports any MCEs.
It’d be great if other Dell XPS13 9360 users could verify that.
Kind regards,
Paul
[1] https://twitter.com/pmenzel_molgen/status/818808708692115456
[2] XPS_9360_1.3.2.exe