Message-ID: <SN6PR12MB263968E448E4CCD4C6856288F8AC0@SN6PR12MB2639.namprd12.prod.outlook.com>
Date: Thu, 15 Aug 2019 20:08:39 +0000
From: "Ghannam, Yazen" <Yazen.Ghannam@....com>
To: Borislav Petkov <bp@...en8.de>
CC: "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v2 0/7] AMD64 EDAC fixes
> -----Original Message-----
> From: linux-edac-owner@...r.kernel.org <linux-edac-owner@...r.kernel.org> On Behalf Of Borislav Petkov
> Sent: Friday, August 2, 2019 9:46 AM
> To: Ghannam, Yazen <Yazen.Ghannam@....com>
> Cc: linux-edac@...r.kernel.org; linux-kernel@...r.kernel.org
> Subject: Re: [PATCH v2 0/7] AMD64 EDAC fixes
>
...
>
> So this still has this confusing reporting of unpopulated nodes:
>
> [ 4.291774] EDAC MC1: Giving out device to module amd64_edac controller F17h: DEV 0000:00:19.3 (INTERRUPT)
> [ 4.292021] EDAC DEBUG: ecc_enabled: Node 2: No enabled UMCs.
> [ 4.292231] EDAC amd64: Node 2: DRAM ECC disabled.
> [ 4.292405] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
> [ 4.292859] EDAC DEBUG: ecc_enabled: Node 3: No enabled UMCs.
> [ 4.292963] EDAC amd64: Node 3: DRAM ECC disabled.
> [ 4.293063] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
> [ 4.293347] AMD64 EDAC driver v3.5.0
>
> which needs fixing.
>
Yes, I agree. I was planning to fix this in a separate set. Is that okay, or should I add it here?

> Regardless, still not good enough. The snowy owl box I have here has 16
> GB:
>
> $ head -n1 /proc/meminfo
> MemTotal: 15715328 kB
>
> and yet
>
> [ 4.282251] EDAC MC: UMC0 chip selects:
> [ 4.282348] EDAC DEBUG: f17_addr_mask_to_cs_size: CS0 DIMM0 AddrMasks:
> [ 4.282455] EDAC DEBUG: f17_addr_mask_to_cs_size: Original AddrMask: 0x1fffffe
> [ 4.282592] EDAC DEBUG: f17_addr_mask_to_cs_size: Deinterleaved AddrMask: 0x1fffffe
> [ 4.282732] EDAC DEBUG: f17_addr_mask_to_cs_size: CS1 DIMM0 AddrMasks:
> [ 4.282839] EDAC DEBUG: f17_addr_mask_to_cs_size: Original AddrMask: 0x1fffffe
> [ 4.283060] EDAC DEBUG: f17_addr_mask_to_cs_size: Deinterleaved AddrMask: 0x1fffffe
> [ 4.283286] EDAC amd64: MC: 0: 8191MB 1: 8191MB
> ^^^^^^^^^^^^^^^^^
>
> [ 4.283456] EDAC amd64: MC: 2: 0MB 3: 0MB
>
> ...
>
> [ 4.285379] EDAC MC: UMC1 chip selects:
> [ 4.285476] EDAC DEBUG: f17_addr_mask_to_cs_size: CS0 DIMM0 AddrMasks:
> [ 4.285583] EDAC DEBUG: f17_addr_mask_to_cs_size: Original AddrMask: 0x1fffffe
> [ 4.285721] EDAC DEBUG: f17_addr_mask_to_cs_size: Deinterleaved AddrMask: 0x1fffffe
> [ 4.285860] EDAC DEBUG: f17_addr_mask_to_cs_size: CS1 DIMM0 AddrMasks:
> [ 4.285967] EDAC DEBUG: f17_addr_mask_to_cs_size: Original AddrMask: 0x1fffffe
> [ 4.286105] EDAC DEBUG: f17_addr_mask_to_cs_size: Deinterleaved AddrMask: 0x1fffffe
> [ 4.286244] EDAC amd64: MC: 0: 8191MB 1: 8191MB
> ^^^^^^^^^^^^^^^^^
>
> [ 4.286345] EDAC amd64: MC: 2: 0MB 3: 0MB
>
> which shows 4 chip selects x 8GB = 32GB.
>
> So something's still wrong. Before the patchset it says:
>
> EDAC MC: UMC0 chip selects:
> EDAC amd64: MC: 0: 8192MB 1: 0MB
> ...
> EDAC MC: UMC1 chip selects:
> EDAC amd64: MC: 0: 8192MB 1: 0MB
>
> which is the correct output.
>
Can you please send me the full kernel log and dmidecode output?

Thanks,
Yazen
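
The size mismatch discussed above can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of the patch set. The function names are hypothetical. The per-chip-select sizes, the AddrMask value, and MemTotal are copied from the logs quoted above. The mapping of AddrMask register bits [31:1] to address bits [39:9] is an assumption about the F17h register layout, not something stated in this thread.

```python
# Illustrative check only. All names are hypothetical; values come from the
# logs quoted in this message.

def total_reported_mb(per_umc_cs_sizes_mb):
    """Sum the chip-select sizes (in MB) reported for every UMC."""
    return sum(sum(cs_sizes) for cs_sizes in per_umc_cs_sizes_mb)


def mask_to_mb(addr_mask):
    """Decode a chip-select size in MB from an AddrMask value.

    Assumption: register bits [31:1] hold address bits [39:9]. Bit 0 is
    dropped, the remaining bits are moved up to bit 9, and the low nine
    address bits are filled in to get the highest byte offset covered.
    """
    highest_offset = ((addr_mask >> 1) << 9) | 0x1FF
    span_bytes = highest_offset + 1
    return span_bytes >> 20  # bytes -> MB


MEMTOTAL_KB = 15715328  # MemTotal from /proc/meminfo in the quoted message

# Sizes per UMC as printed with the patch set applied (chip selects 0..3).
with_patchset = [[8191, 8191, 0, 0], [8191, 8191, 0, 0]]

# Sizes per UMC as printed before the patch set. Only chip selects 0 and 1
# appear in the quoted log, so only those are listed here.
without_patchset = [[8192, 0], [8192, 0]]

print("MemTotal (MB):        ", MEMTOTAL_KB // 1024)
print("with patch set (MB):  ", total_reported_mb(with_patchset))
print("before patch set (MB):", total_reported_mb(without_patchset))
print("0x1fffffe decodes to (MB):", mask_to_mb(0x1FFFFFE))
```

With the logged values, the total reported with the patch set applied is roughly twice MemTotal, while the total reported before the patch set is consistent with the 16 GB installed in the box.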