Message-ID: <6911a79a-bcd7-03e1-1c90-2adb88aaa1db@amazon.com>
Date: Wed, 12 Jun 2019 15:35:31 +0300
From: "Hawa, Hanna" <hhhawa@...zon.com>
To: Borislav Petkov <bp@...en8.de>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>
CC: Mauro Carvalho Chehab <mchehab@...nel.org>,
James Morse <james.morse@....com>,
"robh+dt@...nel.org" <robh+dt@...nel.org>,
"Woodhouse, David" <dwmw@...zon.co.uk>,
"paulmck@...ux.ibm.com" <paulmck@...ux.ibm.com>,
"mark.rutland@....com" <mark.rutland@....com>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"nicolas.ferre@...rochip.com" <nicolas.ferre@...rochip.com>,
"devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
"Shenhar, Talel" <talel@...zon.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Chocron, Jonathan" <jonnyc@...zon.com>,
"Krupnik, Ronen" <ronenk@...zon.com>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"Hanoch, Uri" <hanochu@...zon.com>
Subject: Re: [PATCH 2/2] edac: add support for Amazon's Annapurna Labs EDAC
Hi Boris,
>
> Yap, I think we're in agreement here. I believe the important question
> is whether you need to get error information from multiple sources
> together in order to do proper recovery or doing it per error source
> suffices.
>
> And I think the actual use cases could/should dictate our
> drivers/orchestrators design.
>
> Thus my question how you guys are planning on tying all that error info
> the drivers report, into the whole system design?
We have a daemon script that collects correctable/uncorrectable error
counts from EDAC sysfs and reports them to an Amazon service, which
allows us to take action when specific error thresholds are crossed.
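
As a rough illustration of that kind of collector (a minimal sketch only, not the actual daemon), the snippet below reads the standard per-memory-controller ce_count/ue_count files under /sys/devices/system/edac/mc and flags counters at or above a limit. The function names, the threshold values, and the choice of the memory-controller nodes are assumptions made for the example; which EDAC sysfs nodes the real script reads, and how it reports them, is not stated here.

```python
import glob
import os

# Standard EDAC memory-controller sysfs directory; other EDAC device
# types expose their counters under different paths.
EDAC_MC_ROOT = "/sys/devices/system/edac/mc"


def read_edac_counts(root=EDAC_MC_ROOT):
    """Return {controller: {"ce": int, "ue": int}} for each mc* directory."""
    counts = {}
    for mc_dir in sorted(glob.glob(os.path.join(root, "mc[0-9]*"))):
        entry = {}
        for key, fname in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                with open(os.path.join(mc_dir, fname)) as f:
                    entry[key] = int(f.read().strip())
            except (OSError, ValueError):
                # Missing or unreadable counter: treat as zero in this sketch.
                entry[key] = 0
        counts[os.path.basename(mc_dir)] = entry
    return counts


def over_threshold(counts, ce_limit=100, ue_limit=1):
    """Return (controller, kind, value) for counters at or above a limit.

    The default limits are hypothetical placeholders.
    """
    limits = {"ce": ce_limit, "ue": ue_limit}
    hits = []
    for mc in sorted(counts):
        for kind in ("ce", "ue"):
            value = counts[mc][kind]
            if value >= limits[kind]:
                hits.append((mc, kind, value))
    return hits


if __name__ == "__main__":
    # A real daemon would poll periodically and send the hits to a
    # reporting service; here they are only printed.
    for mc, kind, value in over_threshold(read_edac_counts()):
        print(f"{mc}: {kind}_count={value} reached threshold")
```

Keeping the sysfs root as a parameter lets the same code be pointed at a test directory instead of the live system.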
Thanks,
Hanna