lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 04 Nov 2020 21:08:52 -0800
From:   Saeed Mahameed <saeed@...nel.org>
To:     George Cherian <george.cherian@...vell.com>,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        Jiri Pirko <jiri@...dia.com>
Cc:     kuba@...nel.org, davem@...emloft.net, sgoutham@...vell.com,
        lcherian@...vell.com, gakula@...vell.com, masahiroy@...nel.org,
        willemdebruijn.kernel@...il.com
Subject: Re: [PATCH v2 net-next 3/3] octeontx2-af: Add devlink health
 reporters for NIX

On Wed, 2020-11-04 at 17:57 +0530, George Cherian wrote:
> Add health reporters for RVU NPA block.
                               ^^^ NIX ?

Cc: Jiri 

Anyway, could you please spare some words on what is NPA and what is
NIX?

Regarding the reporters names, all drivers register well known generic
names such as (fw,hw,rx,tx), I don't know if it is a good idea to use
vendor specific names, if you are reporting for hw/fw units then just
use "hw" or "fw" as the reporter name and append the unit NPA/NIX to
the counter/error names.

> Only reporter dump is supported.
> 
> Output:
>  # ./devlink health
>  pci/0002:01:00.0:
>    reporter npa
>      state healthy error 0 recover 0
>    reporter nix
>      state healthy error 0 recover 0
>  # ./devlink  health dump show pci/0002:01:00.0 reporter nix
>   NIX_AF_GENERAL:
>          Memory Fault on NIX_AQ_INST_S read: 0
>          Memory Fault on NIX_AQ_RES_S write: 0
>          AQ Doorbell error: 0
>          Rx on unmapped PF_FUNC: 0
>          Rx multicast replication error: 0
>          Memory fault on NIX_RX_MCE_S read: 0
>          Memory fault on multicast WQE read: 0
>          Memory fault on mirror WQE read: 0
>          Memory fault on mirror pkt write: 0
>          Memory fault on multicast pkt write: 0
>    NIX_AF_RAS:
>          Poisoned data on NIX_AQ_INST_S read: 0
>          Poisoned data on NIX_AQ_RES_S write: 0
>          Poisoned data on HW context read: 0
>          Poisoned data on packet read from mirror buffer: 0
>          Poisoned data on packet read from mcast buffer: 0
>          Poisoned data on WQE read from mirror buffer: 0
>          Poisoned data on WQE read from multicast buffer: 0
>          Poisoned data on NIX_RX_MCE_S read: 0
>    NIX_AF_RVU:
>          Unmap Slot Error: 0
> 

Now i am a little bit skeptic here, devlink health reporter
infrastructure was never meant to deal with dump op only, the main
purpose is to diagnose/dump and recover.

especially in your use case where you only report counters, i don't
believe devlink health dump is a proper interface for this.
Many of these counters if not most are data path packet based and maybe
they should belong to ethtool.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ