Message-ID: <20110413173705.GJ2791@aftab>
Date:	Wed, 13 Apr 2011 19:37:05 +0200
From:	Borislav Petkov <bp@...64.org>
To:	Prarit Bhargava <prarit@...hat.com>
Cc:	Borislav Petkov <bp@...64.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Russ Anderson <rja@....com>,
	"Luck, Tony" <tony.luck@...el.com>,
	"dzickus@...hat.com" <dzickus@...hat.com>,
	"mstowe@...hat.com" <mstowe@...hat.com>,
	"dnelson@...hat.com" <dnelson@...hat.com>,
	"rja@...ricas.sgi.com" <rja@...ricas.sgi.com>
Subject: Re: [PATCH -v2] x86, MCE: Drop default decoding notifier

On Wed, Apr 13, 2011 at 01:14:35PM -0400, Prarit Bhargava wrote:
> 
> 
> On 04/13/2011 01:01 PM, Prarit Bhargava wrote:
> >   
> >> @@ -239,7 +227,9 @@ static void print_mce(struct mce *m)
> >>  	 * Print out human-readable details about the MCE error,
> >>  	 * (if the CPU has an implementation for that)
> >>  	 */
> >> -	atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
> >> +	ret = atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
> >> +	if (ret != NOTIFY_STOP)
> >> +		pr_emerg(HW_ERR "Run the above through 'mcelog --ascii' to decode.\n");
> >>  }
> >>   
> >>     
> > Borislav,
> >
> >   
> 
> Oops.  Let me *carefully* rephrase that so it is clear what I'm
> complaining about.
> 
> > I still think you need the check for UC here.  When an UC occurs and
> > mce_panic() is called the output will include:
> >
> > [Hardware Error]:  Run the above through 'mcelog --ascii' to decode.
> >
> > potentially many, many times
> 
> for _all_ unreported *correctable* errors.
> 
> > .  The problem still is that there is no
> > output to decode (in the default case).
> >
> >   
> 
> ie) (sorry for the cut-and-paste)
> 
>         /* First print corrected ones that are still unlogged */
>         for (i = 0; i < MCE_LOG_LEN; i++) {
>                 struct mce *m = &mcelog.entry[i];
>                 if (!(m->status & MCI_STATUS_VAL))
>                         continue;
>                 if (!(m->status & MCI_STATUS_UC)) {
>                         print_mce(m);
>                         if (!apei_err)
>                                 apei_err = apei_write_mce(m);
>                 }
>         }
> 
> will potentially result in many bogus messages during a time at which we
> definitely do not want bogus messages.

I don't think that this is a problem. This is on the panic path and it
is supposed to dump only the _unreported_ CE MCEs queued in the mcelog
which can contain 32 MCEs max.

In the worst case, we will report 32 CEs before panicking. For that case
we either do printk_once as Tony suggested or we ratelimit it. I'll
update the patch.
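
To illustrate the printk_once option mentioned above, here is a minimal
user-space sketch, not the actual patch. It assumes a hypothetical helper
print_decode_hint_once() that stands in for the kernel's printk_once(), and
a hypothetical dump_unlogged_ce() that stands in for the panic-path loop
quoted earlier. The names, the static flag, and the use of fprintf() are
assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MCE_LOG_LEN 32	/* per the mail: the mcelog holds at most 32 MCEs */

/* Hypothetical stand-in for the state that printk_once() keeps internally. */
static bool hint_printed;

/*
 * Print the decode hint only on the first call.
 * Returns 1 if this call printed the hint, 0 if it was suppressed.
 */
static int print_decode_hint_once(void)
{
	if (hint_printed)
		return 0;

	hint_printed = true;
	fprintf(stderr,
		"[Hardware Error]: Run the above through 'mcelog --ascii' to decode.\n");
	return 1;
}

/*
 * Simulate dumping n queued corrected errors on the panic path.
 * Returns how many times the hint was actually emitted.
 */
static int dump_unlogged_ce(int n)
{
	int i, hints = 0;

	assert(n >= 0);
	if (n > MCE_LOG_LEN)
		n = MCE_LOG_LEN;	/* the log cannot hold more entries */

	for (i = 0; i < n; i++)
		hints += print_decode_hint_once();

	return hints;
}
```

With this shape, a full log of 32 corrected errors produces the hint a
single time instead of 32 times. A ratelimited variant would instead allow
a bounded number of hints per time window.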

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
