Message-ID: <663e9bd4c2525_db82d29451@dwillia2-xfh.jf.intel.com.notmuch>
Date: Fri, 10 May 2024 15:12:36 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: "Fabio M. De Francesco" <fabio.m.de.francesco@...ux.intel.com>, "Borislav
Petkov" <bp@...en8.de>
CC: "Rafael J. Wysocki" <rafael@...nel.org>, Len Brown <lenb@...nel.org>, Tony
Luck <tony.luck@...el.com>, <linux-acpi@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linux-edac@...r.kernel.org>, Dan Williams
<dan.j.williams@...el.com>
Subject: Re: [RFC PATCH v2 3/3] ACPI: extlog: Make print_extlog_rcd() log
unconditionally
Fabio M. De Francesco wrote:
> On Friday, May 10, 2024 9:25:56 PM GMT+2 Borislav Petkov wrote:
> > On Fri, May 10, 2024 at 09:00:34PM +0200, Fabio M. De Francesco wrote:
> > > I thought that ELOG and GHES should be modeled consistently. ghes_proc()
> > > prints to the console while ghes_do_proc() also uses ftrace.
> >
> > ghes_proc() calls ghes_do_proc(). I have no clue what you mean here.
> >
>
> My understanding is that ghes_proc() logs to the console and ghes_do_proc()
> calls the tracers.
>
> Therefore, GHES at the same time always reports the errors via two different
> means.
>
> Instead ELOG depends on the check on ras_userspace_consumers() to decide
> whether to call print_extlog_rcd() to print the logs. And if it prints to the
> kernel logs, it jumps to "out" and skips the tracers.
>
> Why is it different with respect to how error reporting is made in GHES?
>
> I thought that ELOG should be modeled similarly to GHES and so it should print
> to the kernel logs always unconditionally and then it should also use ftrace
> (no goto "out" and skip tracers).
>
> (1) Is my understanding of logging and tracing in ELOG and GHES correct?
> (2) If it is, does it make sense for ELOG to print to the kernel log,
> unconditionally, and then call the tracers like ghes_proc() + ghes_do_proc()
> do?

I had asked Fabio to take a look at whether it made sense to continue
with the concept of ras_userspace_consumers(), especially since it seems
limited to the EXTLOG case.

In general, I am finding that the logging approaches are inconsistent
between OS Native and Firmware First error reporting.

As far as I can see, rasdaemon would not even notice if the "daemon_active"
debugfs file went away [1], and it should be the case that the
tracepoints always fire whether daemon_active is open or not.
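
To make the difference concrete, here is a minimal user-space sketch in C
of the two flows discussed in this thread. It is not the kernel code: the
struct and function names are made up for illustration, and the counters
merely stand in for print_extlog_rcd() and for the tracepoints firing.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, not kernel APIs. */
struct report_stats {
	unsigned int console_logs;	/* stands in for print_extlog_rcd() */
	unsigned int trace_events;	/* stands in for a tracepoint firing */
};

/*
 * Gated flow as described in the thread: with no userspace consumer,
 * print to the kernel log and skip the tracers; otherwise only trace.
 */
static void report_gated(struct report_stats *st, bool userspace_consumer)
{
	if (!userspace_consumer) {
		st->console_logs++;
		return;		/* models the 'goto out' that skips tracing */
	}
	st->trace_events++;
}

/*
 * Flow under discussion: log and trace every time, regardless of
 * whether anything has "daemon_active" open.
 */
static void report_unconditional(struct report_stats *st, bool userspace_consumer)
{
	(void)userspace_consumer;	/* no longer affects the outcome */
	st->console_logs++;
	st->trace_events++;
}
```

With the gated variant, each error reaches exactly one sink, chosen by
whether a consumer is registered. With the unconditional variant, every
error reaches both sinks, which is closer to what Fabio describes for
ghes_proc() + ghes_do_proc().
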

So I was expecting this removal to be a conversation starter on the
wider topic of error reporting consistency.

[1]: https://github.com/mchehab/rasdaemon/blob/master/ras-events.c#L992