Date:	Tue, 13 Aug 2013 19:58:09 +0200
From:	Borislav Petkov <bp@...en8.de>
To:	"Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com>
Cc:	Mauro Carvalho Chehab <m.chehab@...sung.com>, tony.luck@...el.com,
	bhelgaas@...gle.com, rostedt@...dmis.org, rjw@...k.pl,
	lance.ortiz@...com, linux-pci@...r.kernel.org,
	linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error
 trace event

On Tue, Aug 13, 2013 at 11:02:08PM +0530, Naveen N. Rao wrote:
> If I'm not mistaken, even for systems that have EDAC drivers, it looks
> to me like EDAC can't really decode to the DIMM given what is provided
> by the bios in the APEI report currently. If and when ghes_edac gains
> this capability, users will have a choice between raw APEI reports vs.
> edac processed ones.

Which kinda makes that APEI tracepoint not really useful and we can call
the one we have already - trace_mc_event - from APEI...

> I started out with a simpler name, but eventually decided to use the
> name from the CPER record so it is clear what this event carries. I
> think this will be better when adding further ghes events for say,
> processor generic, PCIe and others.

This is exactly my fear: having to add a tracepoint per error type
instead of having a single trace_hw_error or so...

> >Btw 2, if GHES can report other types of errors (I'm pretty sure it can)
> >maybe we can use a single tracepoint called trace_ghes_event for any
> >types of errors coming out of it...
> 
> Two problems with this:
> - One, the record size will be really big since the cper records for
> each type of error is large.

I better go look at that CPER crap....

> - Two, it may be better to filter events based on the type of error
> (memory error, processor, pcie, ...) rather than subscribing for all
> ghes error reports.

You can filter that in userspace too.
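
To illustrate the userspace filtering idea, here is a minimal sketch. It assumes a hypothetical single tracepoint (called `ghes_event` here) whose text output carries an `error_type=` field; neither the event name nor the field names come from this thread or from an existing kernel event, they are placeholders for whatever format such a tracepoint would define.

```python
import re
from typing import Iterable, Iterator, Set

# Hypothetical field name: a real event would define its own output format.
TYPE_FIELD = re.compile(r"\berror_type=(\w+)")


def filter_events(lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield only the trace lines whose error_type field is in `wanted`."""
    for line in lines:
        match = TYPE_FIELD.search(line)
        if match and match.group(1) in wanted:
            yield line.rstrip("\n")


# Made-up sample lines in the assumed format, for illustration only.
sample = [
    "kworker/0:1-42 [000] 100.000001: ghes_event: error_type=memory severity=corrected\n",
    "kworker/0:1-42 [000] 100.000002: ghes_event: error_type=pcie severity=fatal\n",
    "kworker/0:1-42 [000] 100.000003: ghes_event: error_type=processor severity=corrected\n",
]

# A consumer interested only in memory errors drops everything else.
memory_only = list(filter_events(sample, {"memory"}))
for event in memory_only:
    print(event)
```

In practice the lines would probably come from an open file such as the ftrace `trace_pipe` rather than a list, and the same generator would work unchanged on that file object.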

> Do you mean conditionally print the cper records based on whether the
> tracepoint is enabled or not? Wouldn't that be confusing if someone is
> monitoring dmesg as well?

Why would you need dmesg if you get your hw errors over the tracepoint?

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.