lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190104085757.GA21274@nanopsycho.orion>
Date:   Fri, 4 Jan 2019 09:57:57 +0100
From:   Jiri Pirko <jiri@...nulli.us>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc:     Eran Ben Elisha <eranbe@...lanox.com>, netdev@...r.kernel.org,
        "David S. Miller" <davem@...emloft.net>,
        Jiri Pirko <jiri@...lanox.com>,
        Moshe Shemesh <moshe@...lanox.com>,
        Aya Levin <ayal@...lanox.com>, Tal Alon <talal@...lanox.com>,
        Ariel Almog <ariela@...lanox.com>
Subject: Re: [PATCH RFC net-next 00/19] Devlink health reporting and recovery
 system

Tue, Jan 01, 2019 at 02:47:27AM CET, jakub.kicinski@...ronome.com wrote:
>On Mon, 31 Dec 2018 16:31:54 +0200, Eran Ben Elisha wrote:
>> The health mechanism is targeted for Real Time Alerting, in order to know when
>> something bad had happened to a PCI device
>> - Provide alert debug information
>> - Self healing
>> - If problem needs vendor support, provide a way to gather all needed debugging
>>   information.
>> 
>> The main idea is to unify and centralize driver health reports in the
>> generic devlink instance and allow the user to set different
>> attributes of the health reporting and recovery procedures.
>> 
>> The devlink health reporter:
>> Device driver creates a "health reporter" per each error/health type.
>> Error/Health type can be a known/generic (eg pci error, fw error, rx/tx error)
>> or unknown (driver specific).
>> For each registered health reporter a driver can issue error/health reports
>> asynchronously. All health reports handling is done by devlink.
>> Device driver can provide specific callbacks for each "health reporter", e.g.
>>  - Recovery procedures
>>  - Diagnostics and object dump procedures
>>  - OOB initial attributes
>> Different parts of the driver can register different types of health reporters
>> with different handlers.
>> 
>> Once an error is reported, devlink health will do the following actions:
>>   * A log is being send to the kernel trace events buffer
>>   * Health status and statistics are being updated for the reporter instance
>>   * Object dump is being taken and saved at the reporter instance (as long as
>>     there is no other Objdump which is already stored)
>>   * Auto recovery attempt is being done. Depends on:
>>     - Auto-recovery configuration
>>     - Grace period vs. time passed since last recover
>> 
>> The user interface:
>> User can access/change each reporter attributes and driver specific callbacks
>> via devlink, e.g per error type (per health reporter)
>>  - Configure reporter's generic attributes (like: Disable/enable auto recovery)
>>  - Invoke recovery procedure
>>  - Run diagnostics
>>  - Object dump
>
>Thanks for continuing this work!
>
>> The devlink health interface (via netlink):
>> DEVLINK_CMD_HEALTH_REPORTER_GET
>>   Retrieves status and configuration info per DEV and reporter.
>> DEVLINK_CMD_HEALTH_REPORTER_SET
>>   Allows reporter-related configuration setting.
>> DEVLINK_CMD_HEALTH_REPORTER_RECOVER
>>   Triggers a reporter's recovery procedure.
>> DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE
>>   Retrieves diagnostics data from a reporter on a device.
>> DEVLINK_CMD_HEALTH_REPORTER_OBJDUMP_GET
>
>The addition of "objdump" and its marshalling is a bit disappointing.
>It seemed to me when region snapshots were added that they would serve
>this exact purpose.  Taking a region snapshot or "core dump", if you
>will, after something went south.  Now it's confusing to have two
>mechanisms serving the same purpose.
>
>It's unclear from quick reading of the code how if the buffers get
>timestamped.  Can you report more than one?
>
>About the marshalling, I'm not sure it belongs in the kernel.  There is
>precedent for adding interpretation of FW blobs in user space (ethtool).
>IMHO it's a more scalable approach, if we want to avoid having 100 kLoC
>drivers.  Amount of code you add to print the simple example from last
>patch is not inspiring confidence.
>
>And on the bike shedding side :) -> I think you should steer clear of
>calling this objdump, as it has very little to do with the
>functionality of well-known objdump tool.  Its not even clear what the
>object the name is referring to is.
>
>Long story short the overlap with region snapshots makes it unclear
>what purpose either serves, and IMHO we should avoid the marshalling by
>teaching user space how to interpret snapshots.  Preferably we only
>have one dump mechanism, and user space can be taught the interpretation
>once.

Unlike regions, which are binary blobs, the "objdump" we have here is a
tree of values (json like). It is not taken from FW/HW. It is generated
in driver and passed down to user to be shown. Originally, Eran just
pushed that info into a string buffer and the user had to parse it
again. I asked to format this in Netlink attributes JSON-style so the
user gets all hierarchy without need of parsing and can easily do
Netlink->human_stdout or Netlink->JSON.



>
>>   Retrieves the last stored objdump. Devlink health
>>   saves a single objdump. If an objdump is not already stored by the devlink
>>   for this reporter, devlink generates a new objdump.
>>   Objdump output is defined by the reporter.
>> DEVLINK_CMD_HEALTH_REPORTER_OBJDUMP_CLEAR
>>   Clears the last saved objdump file for the specified reporter.
>> 
>> 
>>                                                netlink
>>                                       +--------------------------+
>>                                       |                          |
>>                                       |            +             |
>>                                       |            |             |
>>                                       +--------------------------+
>>                                                    |request for ops
>>                                                    |(diagnose,
>>  mlx5_core                             devlink     |recover,
>>                                                    |dump)
>> +--------+                            +--------------------------+
>> |        |                            |    reporter|             |
>> |        |                            |  +---------v----------+  |
>> |        |   ops execution            |  |                    |  |
>> |     <----------------------------------+                    |  |
>> |        |                            |  |                    |  |
>> |        |                            |  + ^------------------+  |
>> |        |                            |    | request for ops     |
>> |        |                            |    | (recover, dump)     |
>> |        |                            |    |                     |
>> |        |                            |  +-+------------------+  |
>> |        |     health report          |  | health handler     |  |
>> |        +------------------------------->                    |  |
>> |        |                            |  +--------------------+  |
>> |        |     health reporter create |                          |
>> |        +---------------------------->                          |
>> +--------+                            +--------------------------+
>> 
>> Available reporters:
>> In this patchset, three reporters of mlx5e driver are included. The FW
>> reporters implement devlink_health_reporter diagnostic, object dump and
>> recovery procedures. The TX reporter implements devlink_health_reporter
>> diagnostic and recovery procedures.
>> 
>> This RFC is based on the same concepts as previous RFC I sent earlier this
>> year: "[RFC PATCH iproute2-next] System specification health API". The API was
>> changed, also devlink code and mlx5e reporters were not available at the
>> previous RFC.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ