Message-ID: <20210610084041.42a73peuwy7ivyt4@spock.localdomain>
Date: Thu, 10 Jun 2021 10:40:41 +0200
From: Oleksandr Natalenko <oleksandr@...alenko.name>
To: Toralf Förster <toralf.foerster@....de>
Cc: Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: repeating [Hardware Error]: Corrected error, no action required.
Hello.
On Wed, Jun 09, 2021 at 08:27:26PM +0200, Toralf Förster wrote:
> My syslog on a hardened Gentoo system shows
>
> # uname -a
> Linux mr-fox 5.12.9 #8 SMP Thu Jun 3 17:59:32 CEST 2021 x86_64 AMD Ryzen
> 9 5950X 16-Core Processor AuthenticAMD GNU/Linux
> mr-fox ~ #
>
> repeating entries every 5 minutes like these (always the same address,
> 0x000000031fb566e0):
>
> Jun 9 16:21:24 mr-fox kernel: mce: [Hardware Error]: Machine check
> events logged
> Jun 9 16:21:24 mr-fox kernel: [Hardware Error]: Corrected error, no
> action required.
> Jun 9 16:21:24 mr-fox kernel: [Hardware Error]: CPU:0 (19:21:0)
> MC17_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000000011b
> Jun 9 16:21:24 mr-fox kernel: [Hardware Error]: Error Addr:
> 0x000000031fb566e0
> Jun 9 16:21:24 mr-fox kernel: [Hardware Error]: IPID:
> 0x0000009600050f00, Syndrome: 0x33fa01000a800101
> Jun 9 16:21:24 mr-fox kernel: [Hardware Error]: Unified Memory
> Controller Ext. Error Code: 0, DRAM ECC error.
> Jun 9 16:21:24 mr-fox kernel: EDAC MC0: 1 CE on mc#0csrow#1channel#0
> (csrow:1 channel:0 page:0xcaed59 offset:0x8e0 grain:64 syndrome:0x100)
> Jun 9 16:21:24 mr-fox kernel: [Hardware Error]: cache level: L3/GEN,
> tx: GEN, mem-tx: RD
>
>
> A hardware memory check by Hetzner didn't find anything.
Did they run memtest in a loop at least 10 times?
> May I ask whether I should worry about this or not?
If the reported page is indeed the same, then probably yes, you should
worry.
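
To check whether it really is the same page every time, something like the
following could be run over the syslog. This is only a rough sketch: the
regex is my assumption based on the single EDAC line quoted above, the
function name count_ce_locations is made up, and the second sample line is
just a repeat of yours for illustration, not real data. Note that syslog
writes each message on one line; the wrapping above comes from the mail.

```python
import re
from collections import Counter

# Assumed format, taken from the one EDAC line quoted in this thread:
#   EDAC MC0: 1 CE on ... (csrow:1 channel:0 page:0x... offset:0x... ...)
EDAC_CE = re.compile(
    r"EDAC MC\d+: \d+ CE .*?page:(0x[0-9a-fA-F]+) offset:(0x[0-9a-fA-F]+)"
)


def count_ce_locations(lines):
    """Count corrected errors per (page, offset) pair found in log lines."""
    counts = Counter()
    for line in lines:
        match = EDAC_CE.search(line)
        if match:
            page = int(match.group(1), 16)
            offset = int(match.group(2), 16)
            counts[(page, offset)] += 1
    return counts


# First line is the entry from the report; the second is a hypothetical
# repeat of the same entry, added only to show the counting.
sample = [
    "Jun 9 16:21:24 mr-fox kernel: EDAC MC0: 1 CE on mc#0csrow#1channel#0 "
    "(csrow:1 channel:0 page:0xcaed59 offset:0x8e0 grain:64 syndrome:0x100)",
    "Jun 9 16:21:24 mr-fox kernel: EDAC MC0: 1 CE on mc#0csrow#1channel#0 "
    "(csrow:1 channel:0 page:0xcaed59 offset:0x8e0 grain:64 syndrome:0x100)",
]

for (page, offset), n in count_ce_locations(sample).items():
    print(f"page {page:#x} offset {offset:#x}: {n} corrected error(s)")
```

If all entries collapse into a single (page, offset) pair, the errors keep
hitting the same spot; many different pairs would point elsewhere.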
--
Oleksandr Natalenko (post-factum)