Message-ID: <84008955-ccae-8508-02af-dfa35d3eed90@mellanox.com>
Date: Tue, 1 Jan 2019 10:01:57 +0000
From: Eran Ben Elisha <eranbe@...lanox.com>
To: Jakub Kicinski <jakub.kicinski@...ronome.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Jiri Pirko <jiri@...lanox.com>,
Moshe Shemesh <moshe@...lanox.com>,
Aya Levin <ayal@...lanox.com>, Tal Alon <talal@...lanox.com>,
Ariel Almog <ariela@...lanox.com>
Subject: Re: [PATCH RFC net-next 19/19] devlink: Add Documentation/networking/devlink-health.txt

On 1/1/2019 3:47 AM, Jakub Kicinski wrote:
> On Mon, 31 Dec 2018 16:32:13 +0200, Eran Ben Elisha wrote:
>> +Once an error is reported, devlink health will perform the following actions:
>> +  * A log is sent to the kernel trace events buffer
>> +  * Health status and statistics are updated for the reporter instance
>> +  * An object dump is taken and saved at the reporter instance (as long as
>> +    there is no other object dump already stored)
>> +  * An auto-recovery attempt is made, depending on:
>> +    - Auto-recovery configuration
>> +    - Grace period vs. time passed since the last recovery
>
> Would it make sense to store the result of last recovery if it failed?
We considered it.
After discussing it internally, we decided that recovery failures should
be reported in the kernel log rather than exposed through the devlink
health show command.
Keep in mind that if a recovery fails, the reporter status is left
unchanged, since no recovery completed successfully.
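
For illustration only, the flow described in the quoted text and in the
reply above could be sketched roughly as follows. This is a simplified
userspace model in C, not code from the patch set: the names
health_reporter, health_report, recover_ok and recover_fail are all
hypothetical, the "dump" is reduced to a flag, and the grace-period rule
(skip auto-recovery if the last successful recovery was less than one
grace period ago) is an assumption about the intended semantics.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of a reporter; NOT the kernel's devlink structures. */
enum health_state { HEALTH_HEALTHY, HEALTH_ERROR };

struct health_reporter {
    enum health_state state;
    uint64_t error_count;     /* statistics: errors reported */
    uint64_t recover_count;   /* statistics: successful recoveries */
    bool dump_stored;         /* an object dump is already saved */
    bool auto_recover;        /* auto-recovery configuration */
    uint64_t grace_period_ms; /* assumed unit: milliseconds */
    uint64_t last_recover_ms; /* time of the last successful recovery */
    bool (*recover)(struct health_reporter *r); /* true on success */
};

/* Example callbacks standing in for a driver's recovery routine. */
bool recover_ok(struct health_reporter *r)   { (void)r; return true; }
bool recover_fail(struct health_reporter *r) { (void)r; return false; }

void health_report(struct health_reporter *r, uint64_t now_ms, const char *msg)
{
    /* 1. Log the event (stderr stands in for the trace events buffer). */
    fprintf(stderr, "health report: %s\n", msg);

    /* 2. Update health status and statistics. */
    r->state = HEALTH_ERROR;
    r->error_count++;

    /* 3. Take a dump only if none is stored yet (modelled as a flag). */
    if (!r->dump_stored)
        r->dump_stored = true;

    /* 4. Auto-recovery, subject to configuration and grace period. */
    if (!r->auto_recover || !r->recover)
        return;
    if (r->recover_count > 0 &&
        now_ms - r->last_recover_ms < r->grace_period_ms)
        return; /* assumed rule: still inside the grace period */

    if (r->recover(r)) {
        r->state = HEALTH_HEALTHY;
        r->recover_count++;
        r->last_recover_ms = now_ms;
    } else {
        /* Failure goes to the log only; reporter status is not touched. */
        fprintf(stderr, "health report: recovery failed\n");
    }
}
```

In this model a failed recovery leaves the reporter in the error state
with its recovery statistics untouched, so the only trace of the failure
is the log line, which matches the behaviour described above.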
>