Message-ID: <26f99f46a3e045889b96b147207905e6@huawei.com>
Date: Wed, 8 Apr 2020 09:20:51 +0000
From: Shiju Jose <shiju.jose@...wei.com>
To: Borislav Petkov <bp@...en8.de>
CC: "linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"rjw@...ysocki.net" <rjw@...ysocki.net>,
"helgaas@...nel.org" <helgaas@...nel.org>,
"lenb@...nel.org" <lenb@...nel.org>,
"james.morse@....com" <james.morse@....com>,
"tony.luck@...el.com" <tony.luck@...el.com>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"zhangliguang@...ux.alibaba.com" <zhangliguang@...ux.alibaba.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
Linuxarm <linuxarm@...wei.com>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
tanxiaofei <tanxiaofei@...wei.com>,
yangyicong <yangyicong@...wei.com>
Subject: RE: [PATCH v6 1/2] ACPI / APEI: Add support to notify the vendor
specific HW errors
Hi Boris,
>-----Original Message-----
>From: Borislav Petkov [mailto:bp@...en8.de]
>Sent: 31 March 2020 10:09
>To: Shiju Jose <shiju.jose@...wei.com>
>Cc: linux-acpi@...r.kernel.org; linux-pci@...r.kernel.org;
>linux-kernel@...r.kernel.org; rjw@...ysocki.net; helgaas@...nel.org;
>lenb@...nel.org; james.morse@....com; tony.luck@...el.com;
>gregkh@...uxfoundation.org; zhangliguang@...ux.alibaba.com;
>tglx@...utronix.de; Linuxarm <linuxarm@...wei.com>; Jonathan Cameron
><jonathan.cameron@...wei.com>; tanxiaofei <tanxiaofei@...wei.com>;
>yangyicong <yangyicong@...wei.com>
>Subject: Re: [PATCH v6 1/2] ACPI / APEI: Add support to notify the vendor
>specific HW errors
>
>On Mon, Mar 30, 2020 at 03:44:29PM +0000, Shiju Jose wrote:
>> 1. rasdaemon need not print the vendor error data reported by the firmware if the
>> kernel driver already prints that information. In this case rasdaemon will only need to store
>> the decoded vendor error data in the SQL database.
>
>Well, there's a problem with this:
>
>rasdaemon printing != kernel driver printing
>
>Because printing in dmesg would need people to go grep dmesg.
>
>Printing through rasdaemon or any userspace agent, OTOH, is a lot more
>flexible wrt analyzing and collecting those error records. Especially if you are a
>data center admin and you want to collect all your error
>records: grepping dmesg simply doesn't scale versus all the rasdaemon
>agents reporting to a centralized location.
Ok.
I posted V7 of this series.
"[v7 PATCH 0/6] ACPI / APEI: Add support to notify non-fatal HW errors"
>
>> 2. The vendor kernel driver may want to report extra error information through
>> the vendor-specific data (though presently we do not have any such use case) for rasdaemon to log.
>> I think the error handled status is useful to indicate that the kernel driver has filled in the extra information and
>> that rasdaemon should decode and log it after an extra-data-specific validity check.
>
>The kernel driver can report that extra information without the kernel saying
>that the error was handled.
>
>So I still see no sense in the kernel telling userspace explicitly that it handled
>the error. There might be a valid reason, though, which I cannot think of
>right now.
Ok.
>
>Thx.
>
>--
>Regards/Gruss,
> Boris.
>
>https://people.kernel.org/tglx/notes-about-netiquette
Thanks,
Shiju