Message-ID: <BN7PR12MB25934DE6FCB596E0D9B8066CF8370@BN7PR12MB2593.namprd12.prod.outlook.com>
Date: Thu, 23 Aug 2018 17:53:25 +0000
From: "Ghannam, Yazen" <Yazen.Ghannam@....com>
To: Borislav Petkov <bp@...en8.de>
CC: "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"tony.luck@...el.com" <tony.luck@...el.com>,
"x86@...nel.org" <x86@...nel.org>
Subject: RE: [PATCH 1/2] Revert "x86/mce/AMD: Collect error info even if valid
bits are not set"
> -----Original Message-----
> From: linux-edac-owner@...r.kernel.org <linux-edac-owner@...r.kernel.org>
> On Behalf Of Borislav Petkov
> Sent: Thursday, August 23, 2018 7:24 AM
> To: Ghannam, Yazen <Yazen.Ghannam@....com>
> Cc: linux-edac@...r.kernel.org; linux-kernel@...r.kernel.org;
> tony.luck@...el.com; x86@...nel.org
> Subject: Re: [PATCH 1/2] Revert "x86/mce/AMD: Collect error info even if valid
> bits are not set"
>
> Reviving an old issue while cleaning my inbox.
>
> On Tue, Mar 27, 2018 at 03:59:37PM +0000, Ghannam, Yazen wrote:
> > > > On Mon, Mar 26, 2018 at 07:58:51PM +0000, Ghannam, Yazen wrote:
> > > > > So at a minimum, we should always save and report as much as we can.
> > > >
> > > > Only on Zen or all AMD families?
> > > >
> > >
> > > I'll confirm with the HW folks. I understand it as a change in philosophy
> > > rather than a change in hardware.
> > >
> >
> > So this recommendation could apply to all families, but it's okay if we just
>
> Ok, so I think we should do this, still, as it is exactly what the
> recommendation says: read the MSRs even if the valid bits are not set and it
> doesn't set any Valid bits to confuse error handling downstream.
>
> This way we'll collect all possible info and then mce_amd.c should stop looking
> at the valid bits too and dump whatever has been logged.
>
> Ok?
>
Yes, this seems okay to me.
Thanks,
Yazen