Message-ID: <20131204203837.GA7517@pd.tnic>
Date:	Wed, 4 Dec 2013 21:38:38 +0100
From:	Borislav Petkov <bp@...en8.de>
To:	rui wang <ruiv.wang@...il.com>
Cc:	Lance Ortiz <lance.ortiz@...com>, bhelgaas@...gle.com,
	lance_ortiz@...mail.com, jiang.liu@...el.com, tony.luck@...el.com,
	rostedt@...dmis.org, mchehab@...hat.com,
	linux-acpi@...r.kernel.org, linux-pci@...r.kernel.org,
	linux-kernel@...r.kernel.org, gong.chen@...el.com
Subject: Re: [PATCH v10 1/3] aerdrv: Trace Event for AER

On Mon, Dec 02, 2013 at 01:05:16PM +0800, rui wang wrote:
> > +	TP_printk("%s PCIe Bus Error: severity=%s, %s\n",
> > +		__get_str(dev_name),
> > +		__entry->severity == HW_EVENT_ERR_CORRECTED ? "Corrected" :
> > +			__entry->severity == HW_EVENT_ERR_FATAL ?
> > +			"Fatal" : "Uncorrected",
> > +		__entry->severity == HW_EVENT_ERR_CORRECTED ?
> > +		__print_flags(__entry->status, "|", aer_correctable_errors) :
> > +		__print_flags(__entry->status, "|", aer_uncorrectable_errors))
> > +);
> 
> This causes inconsistency between dmesg and the trace event output.
> When dmesg says "severity=Corrected", the trace event says
> "severity=Fatal". What happens is that HW_EVENT_ERR_CORRECTED is
> defined in edac.h:
> 
> enum hw_event_mc_err_type {
>         HW_EVENT_ERR_CORRECTED,
>         HW_EVENT_ERR_UNCORRECTED,
>         HW_EVENT_ERR_FATAL,
>         HW_EVENT_ERR_INFO,
> };
> 
> while aer_print_error() uses aer_error_severity_string[] defined as:
> 
> static const char *aer_error_severity_string[] = {
>         "Uncorrected (Non-Fatal)",
>         "Uncorrected (Fatal)",
>         "Corrected"
> };
> 
> In this case dmesg is correct because info->severity is assigned in
> aer_isr_one_error() using the definitions in include/linux/ras.h:
> #define AER_NONFATAL                    0
> #define AER_FATAL                       1
> #define AER_CORRECTABLE                 2
> 
> So which one is the standard? Is there a plan to unify all these names?

Yes, the AER tracepoint above should use the AER_* defines and not the
HW_EVENT_ERR_* ones, which are for memory errors.
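
FWIW, the mismatch can be reproduced in plain userspace C. What follows
is only a sketch, not kernel code: the enum, the AER_* values and the
string table are copied from the quoted text above, while the helper
names severity_as_posted(), severity_with_aer_defines() and
dmesg_severity() are made up for illustration. The second helper is
just an assumption about what a comparison against the AER_* defines
could look like, not the actual fix.

```c
#include <assert.h>
#include <stdio.h>

/* Memory-error severities, as quoted above from edac.h. */
enum hw_event_mc_err_type {
	HW_EVENT_ERR_CORRECTED,		/* 0 */
	HW_EVENT_ERR_UNCORRECTED,	/* 1 */
	HW_EVENT_ERR_FATAL,		/* 2 */
	HW_EVENT_ERR_INFO,		/* 3 */
};

/* AER severities, as quoted above. */
#define AER_NONFATAL	0
#define AER_FATAL	1
#define AER_CORRECTABLE	2

/* String table used by aer_print_error(), as quoted above. */
static const char *aer_error_severity_string[] = {
	"Uncorrected (Non-Fatal)",
	"Uncorrected (Fatal)",
	"Corrected"
};

/* Hypothetical helper: what dmesg prints for an AER severity value. */
static const char *dmesg_severity(int severity)
{
	assert(severity >= 0 && severity < 3);
	return aer_error_severity_string[severity];
}

/* Hypothetical helper: the comparison chain from the posted TP_printk(). */
static const char *severity_as_posted(int severity)
{
	return severity == HW_EVENT_ERR_CORRECTED ? "Corrected" :
	       severity == HW_EVENT_ERR_FATAL ? "Fatal" : "Uncorrected";
}

/* Hypothetical helper: same chain, but compared against AER_* (assumed). */
static const char *severity_with_aer_defines(int severity)
{
	return severity == AER_CORRECTABLE ? "Corrected" :
	       severity == AER_FATAL ? "Fatal" : "Uncorrected";
}

/* Print all three views side by side for every AER severity value. */
static void print_comparison(void)
{
	int sev;

	for (sev = AER_NONFATAL; sev <= AER_CORRECTABLE; sev++)
		printf("%d: dmesg=\"%s\" posted=\"%s\" aer=\"%s\"\n", sev,
		       dmesg_severity(sev), severity_as_posted(sev),
		       severity_with_aer_defines(sev));
}
```

Calling print_comparison() from a main() shows the problem for
AER_CORRECTABLE (2): it has the same numeric value as
HW_EVENT_ERR_FATAL, so the posted chain picks "Fatal" while dmesg says
"Corrected".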

Wanna send a fix?

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.