Message-ID: <SJ1PR11MB6083E1000D4B267CF4271135FC792@SJ1PR11MB6083.namprd11.prod.outlook.com>
Date: Fri, 26 Jan 2024 20:49:03 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Borislav Petkov <bp@...en8.de>
CC: Avadhut Naik <avadhut.naik@....com>, "linux-trace-kernel@...r.kernel.org"
<linux-trace-kernel@...r.kernel.org>, "linux-edac@...r.kernel.org"
<linux-edac@...r.kernel.org>, "rostedt@...dmis.org" <rostedt@...dmis.org>,
"x86@...nel.org" <x86@...nel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "yazen.ghannam@....com"
<yazen.ghannam@....com>, "avadnaik@....com" <avadnaik@....com>
Subject: RE: [PATCH v2 0/2] Update mce_record tracepoint
> > Is it so very different to add this to a trace record so that rasdaemon
> > can have feature parity with mcelog(8)?
>
> I knew you were gonna say that. When someone decides that it is
> a splendid idea to add more stuff to struct mce then said someone would
> want it in the tracepoint too.
>
> And then we're back to my original question:
>
> "And where does it end? Stick full dmesg in the tracepoint too?"
>
> Where do you draw the line in the sand and say, no more, especially
> static, fields bloating the trace record should be added and from then
> on, you should go collect the info from that box. Something which you're
> supposed to do anyway.

Every patch that adds new code or data structures adds to the kernel
memory footprint. Each should be considered on its merits. The basic
question being:

"Is the new functionality worth the cost?"

Where does it end? It would end if Linus declared:

"Linux is now complete. Stop sending patches".

I.e. it is never going to end.

If somebody posts a patch asking to add the full dmesg to a
tracepoint, I'll stand with you to say: "Not only no, but hell no".

So for Naik's two patches we have:

1) PPIN
   Cost = 8 bytes.
   Benefit: Embeds a system identifier into the trace record so there
   can be no ambiguity about which machine generated this error.
   Also definitively indicates which socket on a multi-socket system.

2) MICROCODE
   Cost = 4 bytes.
   Benefit: Certainty about the microcode version active on the core
   at the time the error was detected.
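
A rough sketch of that arithmetic, assuming the two values are carried as
a 64-bit and a 32-bit integer. The struct and names below are illustrative
only and are not the actual tracepoint definition:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two added fields (not the real layout). */
struct mce_extra_fields {
	uint64_t ppin;      /* Protected Processor Inventory Number */
	uint32_t microcode; /* microcode revision active on the core */
};

/* Sum of the raw field sizes; deliberately ignores struct padding. */
enum {
	MCE_EXTRA_BYTES = sizeof(uint64_t) + sizeof(uint32_t)
};

/* Extra payload added to each trace record under the assumption above. */
static inline size_t mce_extra_bytes(void)
{
	return MCE_EXTRA_BYTES;
}
```

That is 12 bytes of added payload per record in total.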

RAS = Reliability, Availability, Serviceability

These changes fall into the serviceability bucket. They make it
easier to diagnose what went wrong.

-Tony