lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 14 Aug 2013 07:43:22 +0200
From:	Borislav Petkov <bp@...en8.de>
To:	"Luck, Tony" <tony.luck@...el.com>
Cc:	"Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com>,
	Mauro Carvalho Chehab <m.chehab@...sung.com>,
	"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
	"rostedt@...dmis.org" <rostedt@...dmis.org>,
	"rjw@...k.pl" <rjw@...k.pl>,
	"lance.ortiz@...com" <lance.ortiz@...com>,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error
 trace event

On Tue, Aug 13, 2013 at 08:13:56PM +0000, Luck, Tony wrote:
> Generic tracepoints are architected to be able to fire at very high
> rates and log huge amounts of information. So we'd need something
> special to say just log these special tracepoints to network/serial.
>
> > Which reminds me, pstore could also be a good thing to use, in addition.
> > Only put error info there as it is limited anyway.
> 
> Yes - space is very limited.  I don't know how to assign priority for logging
> the dmesg data vs. some error logs.

Didn't we say at some point, "log only the panic messsage which kills
the machine"?

However, we probably could use more the messages before that
catastrophic event because they could give us hints about what lead to
the panic but in that case maybe a limited pstore is the wrong logging
medium.

Actually, I can imagine the full serial/network logs of "special"
tracepoints + dmesg to be the optimal thing.

> If we just "printk()" the most important parts - then that data will
> automatically flow to the serial console and to pstore.

Actually, does the pstore act like a circular buffer? Because if it
contains the last N relevant messages (for an arbitrary definition of
relevant) before the system dies, then that could more helpful than only
the error messages.

And with the advent of UEFI, pretty much every system has a pstore. Too
bad that we have to limit it to 50% of size so that the boxes don't
brick. :-P

> Then we have multiple paths for the critical bits of the error log
> - and the tracepoints give us more details for the cases where the
> machine doesn't spontaneously explode.

Ok, let's sort:

* First we have the not-so-critical hw error messages. We want to carry
those out-of-band, i.e. not in dmesg so that people don't have to parse
and collect dmesg but have a specialized solution which gives them
structured logs and tools can analyze, collect and ... those errors.

* When a critical error happens, the above usage is not necessarily
advantageous anymore in the sense that, in order to debug what caused
the machine to crash, we don't simply necessarily want only the crash
message but also the whole system activity that lead to it.

In which case, we probably actually want to turn off/ignore the error
logging tracepoints and write *only* to dmesg which goes out over serial
and to pstore. Right?

Because in such cases I want to have *all* *relevant* messages that lead
to the explosion + the explosion message itself.

Makes sense? Yes, no? Aspects I've missed?

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ