lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120430081513.GD8182@aftab.osrc.amd.com>
Date:	Mon, 30 Apr 2012 10:15:13 +0200
From:	Borislav Petkov <bp@...64.org>
To:	Mauro Carvalho Chehab <mchehab@...hat.com>
Cc:	Linux Edac Mailing List <linux-edac@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Aristeu Rozanski <arozansk@...hat.com>,
	Doug Thompson <norsk5@...oo.com>,
	Mark Gross <mark.gross@...el.com>,
	Jason Uhlenkott <juhlenko@...mai.com>,
	Tim Small <tim@...tersideup.com>,
	Ranganathan Desikan <ravi@...ztechnologies.com>,
	"Arvind R." <arvino55@...il.com>, Olof Johansson <olof@...om.net>,
	Egor Martovetsky <egor@...emi.com>,
	Chris Metcalf <cmetcalf@...era.com>,
	Michal Marek <mmarek@...e.cz>, Jiri Kosina <jkosina@...e.cz>,
	Joe Perches <joe@...ches.com>,
	Dmitry Eremin-Solenikov <dbaryshkov@...il.com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Hitoshi Mitake <h.mitake@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Niklas Söderlund 
	<niklas.soderlund@...csson.com>,
	Shaohui Xie <Shaohui.Xie@...escale.com>,
	Josh Boyer <jwboyer@...il.com>, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH EDACv16 1/2] edac: Change internal representation to work
 with layers

On Sun, Apr 29, 2012 at 10:49:44AM -0300, Mauro Carvalho Chehab wrote:
> > [   10.486440] EDAC MC: DCT0 chip selects:
> > [   10.486443] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> > [   10.486445] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> > [   10.486448] EDAC amd64: MC: 4:     0MB 5:     0MB
> > [   10.486450] EDAC amd64: MC: 6:     0MB 7:     0MB
> > [   10.486453] EDAC DEBUG: amd64_debug_display_dimm_sizes: F2x180 (DRAM Bank Address Mapping): 0x00000088
> > [   10.486455] EDAC MC: DCT1 chip selects:
> > [   10.486458] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> > [   10.486460] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> > [   10.486463] EDAC amd64: MC: 4:     0MB 5:     0MB
> > [   10.486465] EDAC amd64: MC: 6:     0MB 7:     0MB
> > [   10.486467] EDAC amd64: using x8 syndromes.
> > [   10.486469] EDAC DEBUG: amd64_dump_dramcfg_low: F2x190 (DRAM Cfg Low): 0x00083100
> > [   10.486472] EDAC DEBUG: amd64_dump_dramcfg_low:   DIMM type: buffered; all DIMMs support ECC: yes
> > [   10.486475] EDAC DEBUG: amd64_dump_dramcfg_low:   PAR/ERR parity: enabled
> > [   10.486478] EDAC DEBUG: amd64_dump_dramcfg_low:   DCT 128bit mode width: 64b
> > [   10.486481] EDAC DEBUG: amd64_dump_dramcfg_low:   x4 logical DIMMs present: L0: yes L1: yes L2: no L3: no
> > [   10.486485] EDAC DEBUG: f1x_early_channel_count: Data width is not 128 bits - need more decoding
> > [   10.486488] EDAC amd64: MCT channel count: 2
> > [   10.486493] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc(): allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)
> > [   10.486501] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: rank0 (0:0:0): row 0, chan 0
> > [   10.486506] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: rank1 (0:1:0): row 0, chan 1
> > [   10.486510] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: rank2 (1:0:0): row 1, chan 0
> > [   10.486514] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: rank3 (1:1:0): row 1, chan 1
> > [   10.486518] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: rank4 (2:0:0): row 2, chan 0
> > [   10.486522] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: rank5 (2:1:0): row 2, chan 1
> > [   10.486526] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: rank6 (3:0:0): row 3, chan 0
> > [   10.486530] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: rank7 (3:1:0): row 3, chan 1
> > [   10.486534] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: rank8 (4:0:0): row 4, chan 0
> > [   10.486538] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: rank9 (4:1:0): row 4, chan 1
> > [   10.486542] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: rank10 (5:0:0): row 5, chan 0
> > [   10.486546] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: rank11 (5:1:0): row 5, chan 1
> > [   10.486550] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: rank12 (6:0:0): row 6, chan 0
> > [   10.486554] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: rank13 (6:1:0): row 6, chan 1
> > [   10.486558] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: rank14 (7:0:0): row 7, chan 0
> > [   10.486562] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: rank15 (7:1:0): row 7, chan 1
> > 
> > DCT0 has 4 ranks + DCT1 also 4 ranks = 8 ranks total.
> > 
> > Now your change is showing 16 ranks. Still b0rked.
> > 
> No, DCT0+DCT1 have 16 ranks, 8 filled and 8 empty. So, it is OK.
> 
> As I said before when you've pointed this bug (likel at v3 review), edac_mc_alloc
> doesn't know how many ranks are filled, as the driver logic first calls it to 
> allocate for the max amount of ranks, and then fills the rank with their info 
> (or let them untouched with 0 pages, if they're empty).

Basically you're saying you're generating dimm_info structs for all
_possible_ dimms and the loop where this debug message comes from goes
and marrily initializes them all although some of them are empty:

+       for (i = 0; i < tot_dimms; i++) {
+               chan = &csi[row].channels[chn];
+               dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
+                              pos[0], pos[1], pos[2]);
+               dimm->mci = mci;
+
+               debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+                       i, (dimm - mci->dimms),
+                       pos[0], pos[1], pos[2], row, chn);
+
+               /* Copy DIMM location */
+               for (j = 0; j < n_layers; j++)
+                       dimm->location[j] = pos[j];
...

definitely superfluous.

Oh well, looking at edac_mc_alloc, it used to allocate structs for all
csrows on the controller even though some of them were empty...

Ok, then please remove this debug call because it is misleading. Having

[   10.486493] EDAC DEBUG: new_edac_mc_alloc: allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)

is enough.

You probably want to say how many channels/csrows there are, though:

[   10.486493] EDAC DEBUG: new_edac_mc_alloc: allocating 3692 bytes for mci data (16 ranks, 8 csrows, 2 channels)

or something similar. Simply dump tot_dimms, tot_channels and tot_csrows
and that's it.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ