Message-ID: <20191122112842.tmf4lkj52hpv6tqd@rric.localdomain>
Date: Fri, 22 Nov 2019 11:28:50 +0000
From: Robert Richter <rrichter@...vell.com>
To: John Garry <john.garry@...wei.com>
CC: Borislav Petkov <bp@...en8.de>,
Mauro Carvalho Chehab <mchehab@...nel.org>,
James Morse <james.morse@....com>,
"tony.luck@...el.com" <tony.luck@...el.com>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
wanghuiqiang <wanghuiqiang@...wei.com>,
Xiaofei Tan <tanxiaofei@...wei.com>,
Linuxarm <linuxarm@...wei.com>,
"Huangming (Mark)" <huangming23@...wei.com>
Subject: Re: linuxnext-2019119 edac warns (was Re: edac KASAN warning in
experimental arm64 allmodconfig boot)
On 21.11.19 15:23:42, John Garry wrote:
> On 21/11/2019 14:23, Robert Richter wrote:
> > On 21.11.19 12:34:22, John Garry wrote:
> > > [ 22.046666] EDAC MC: bug in low-level driver: attempt to assign
> > > [ 22.046666] duplicate mc_idx 0 in add_mc_to_global_list()
> > > [ 22.058311] ghes_edac: Can't register at EDAC core
> > > [ 22.065402] EDAC MC: bug in low-level driver: attempt to assign
> > > [ 22.065402] duplicate mc_idx 0 in add_mc_to_global_list()
> > > [ 22.077080] ghes_edac: Can't register at EDAC core
> > > [ 22.084140] EDAC MC: bug in low-level driver: attempt to assign
> > > [ 22.084140] duplicate mc_idx 0 in add_mc_to_global_list()
> > > [ 22.095789] ghes_edac: Can't register at EDAC core
> > > [ 22.102873] EDAC MC: bug in low-level driver: attempt to assign
> > > [ 22.102873] duplicate mc_idx 0 in add_mc_to_global_list()
> > > [ 22.115442] ghes_edac: Can't register at EDAC core
> > > [ 22.122536] EDAC MC: bug in low-level driver: attempt to assign
> > > [ 22.122536] duplicate mc_idx 0 in add_mc_to_global_list()
> > > [ 22.134344] ghes_edac: Can't register at EDAC core
> > > [ 22.141441] EDAC MC: bug in low-level driver: attempt to assign
> > > [ 22.141441] duplicate mc_idx 0 in add_mc_to_global_list()
> > > [ 22.153089] ghes_edac: Can't register at EDAC core
> > > [ 22.160161] EDAC MC: bug in low-level driver: attempt to assign
> > > [ 22.160161] duplicate mc_idx 0 in add_mc_to_global_list()
> > > [ 22.171810] ghes_edac: Can't register at EDAC core
> >
> > What I am more concerned is this here. In total this implies 8 ghes
> > users that all try to register a (single-instance) ghes mc device. For
> > non-x86 only one instance is allowed (see ghes_edac_register(), idx =
> > 0).

I also looked into this: With refcount_inc_checked() enabled, the
refcount is *not* increased from 0 to 1. Under the hood only
refcount_inc_not_zero() is called instead of refcount_inc(). So the
refcount is still zero after an edac mc device was registered. Instead
of sharing the edac mc device, the driver tries to allocate another mc
device for each GHESv2 entry in the HEST table. This causes the
'duplicate mc_idx' message. Also, it is ok to have multiple GHESv2
entries (your system seems to have 8 entries), e.g. to serve different
kinds of errors in the system.
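
To illustrate the sequence described above, here is a minimal userspace
sketch. It is not the ghes_edac code: every model_* name is hypothetical,
the counter is a plain integer rather than the kernel's refcount_t, and
the register path is reduced to "share if the count is non-zero,
otherwise try to claim mc_idx 0". Under those assumptions, a checked
increment that refuses to go from 0 to 1 leaves the count at zero, so
every later caller tries to allocate again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference counter; not the kernel's refcount_t. */
struct model_refcount {
	unsigned int val;
};

/* Increment only if the counter is non-zero; report whether it happened. */
static bool model_inc_not_zero(struct model_refcount *r)
{
	if (r->val == 0)
		return false;
	r->val++;
	return true;
}

/* Models the checked increment described above: 0 -> 1 is refused. */
static void model_inc_checked(struct model_refcount *r)
{
	(void)model_inc_not_zero(r);
}

/* Models an unchecked increment: 0 -> 1 is allowed. */
static void model_inc_plain(struct model_refcount *r)
{
	r->val++;
}

/* Hypothetical driver state: one shared mc device with index 0. */
struct model_state {
	struct model_refcount ref;
	bool mc_idx0_taken;
	int duplicates;
};

/* Returns 0 if the device was shared or registered, -1 on a duplicate index. */
static int model_register(struct model_state *s, bool checked)
{
	/* Share the existing device if somebody already holds a reference. */
	if (model_inc_not_zero(&s->ref))
		return 0;

	/* Otherwise try to register a new device with index 0. */
	if (s->mc_idx0_taken) {
		s->duplicates++;
		return -1;
	}
	s->mc_idx0_taken = true;

	/* Take the first reference; with the checked variant this is a no-op. */
	if (checked)
		model_inc_checked(&s->ref);
	else
		model_inc_plain(&s->ref);
	return 0;
}

/* Register 'entries' users and count the duplicate-index failures. */
static int model_count_duplicates(int entries, bool checked)
{
	struct model_state s = { { 0 }, false, 0 };

	assert(entries >= 0);
	for (int i = 0; i < entries; i++)
		model_register(&s, checked);
	return s.duplicates;
}
```

In this model, model_count_duplicates(8, true) counts one failure for
every entry after the first, while model_count_duplicates(8, false)
counts none because the later entries share the first device.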
Thanks,
-Robert