Message-ID: <c18db7e0738d4895a4893ded1e6cd99a@huawei.com>
Date: Fri, 2 Oct 2020 12:23:08 +0000
From: Shiju Jose <shiju.jose@...wei.com>
To: Borislav Petkov <bp@...en8.de>, James Morse <james.morse@....com>
CC: "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"tony.luck@...el.com" <tony.luck@...el.com>,
"rjw@...ysocki.net" <rjw@...ysocki.net>,
"lenb@...nel.org" <lenb@...nel.org>, Linuxarm <linuxarm@...wei.com>
Subject: RE: [PATCH 1/1] RAS: Add CPU Correctable Error Collector to isolate
an erroneous CPU core
Hi Boris, Hi James,
>-----Original Message-----
>From: Borislav Petkov [mailto:bp@...en8.de]
>Sent: 01 October 2020 18:31
>To: James Morse <james.morse@....com>
>Cc: Shiju Jose <shiju.jose@...wei.com>; linux-edac@...r.kernel.org; linux-
>acpi@...r.kernel.org; linux-kernel@...r.kernel.org; tony.luck@...el.com;
>rjw@...ysocki.net; lenb@...nel.org; Linuxarm <linuxarm@...wei.com>
>Subject: Re: [PATCH 1/1] RAS: Add CPU Correctable Error Collector to isolate
>an erroneous CPU core
>
>On Thu, Oct 01, 2020 at 06:16:03PM +0100, James Morse wrote:
>> If the corrected-count is available somewhere, can't this policy be
>> made in user-space?
>
>You mean rasdaemon goes and offlines CPUs when certain thresholds are
>reached? Sure. It would be much more flexible too.
I will send kernel changes that extend the existing CEC to handle CPU correctable errors.
Could you please take a look?

Thanks,
Shiju
>
>--
>Regards/Gruss,
> Boris.
>
>https://people.kernel.org/tglx/notes-about-netiquette