Date:   Tue, 1 Dec 2020 03:35:56 +0000
From:   George Cherian <gcherian@...vell.com>
To:     Jakub Kicinski <kuba@...nel.org>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        Sunil Kovvuri Goutham <sgoutham@...vell.com>,
        Linu Cherian <lcherian@...vell.com>,
        "Geethasowjanya Akula" <gakula@...vell.com>,
        "masahiroy@...nel.org" <masahiroy@...nel.org>,
        "willemdebruijn.kernel@...il.com" <willemdebruijn.kernel@...il.com>,
        "saeed@...nel.org" <saeed@...nel.org>,
        "jiri@...nulli.us" <jiri@...nulli.us>
Subject: Re: [PATCHv5 net-next 2/3] octeontx2-af: Add devlink health reporters
 for NPA

Hi Jakub,

> -----Original Message-----
> From: Jakub Kicinski <kuba@...nel.org>
> Sent: Tuesday, December 1, 2020 7:59 AM
> To: George Cherian <gcherian@...vell.com>
> Cc: netdev@...r.kernel.org; linux-kernel@...r.kernel.org;
> davem@...emloft.net; Sunil Kovvuri Goutham <sgoutham@...vell.com>;
> Linu Cherian <lcherian@...vell.com>; Geethasowjanya Akula
> <gakula@...vell.com>; masahiroy@...nel.org;
> willemdebruijn.kernel@...il.com; saeed@...nel.org; jiri@...nulli.us
> Subject: Re: [PATCHv5 net-next 2/3] octeontx2-af: Add devlink health
> reporters for NPA
> 
> On Thu, 26 Nov 2020 19:32:50 +0530 George Cherian wrote:
> > Add health reporters for RVU NPA block.
> > NPA Health reporters handle following HW event groups
> >  - GENERAL events
> >  - ERROR events
> >  - RAS events
> >  - RVU event
> > An event counter per event is maintained in SW.
> >
> > Output:
> >  # devlink health
> >  pci/0002:01:00.0:
> >    reporter hw_npa
> >      state healthy error 0 recover 0
> >  # devlink  health dump show pci/0002:01:00.0 reporter hw_npa
> >  NPA_AF_GENERAL:
> >         Unmap PF Error: 0
> >         NIX:
> >         0: free disabled RX: 0 free disabled TX: 0
> >         1: free disabled RX: 0 free disabled TX: 0
> >         Free Disabled for SSO: 0
> >         Free Disabled for TIM: 0
> >         Free Disabled for DPI: 0
> >         Free Disabled for AURA: 0
> >         Alloc Disabled for Resvd: 0
> >   NPA_AF_ERR:
> >         Memory Fault on NPA_AQ_INST_S read: 0
> >         Memory Fault on NPA_AQ_RES_S write: 0
> >         AQ Doorbell Error: 0
> >         Poisoned data on NPA_AQ_INST_S read: 0
> >         Poisoned data on NPA_AQ_RES_S write: 0
> >         Poisoned data on HW context read: 0
> >   NPA_AF_RVU:
> >         Unmap Slot Error: 0
> 
> You seem to have missed the feedback Saeed and I gave you on v2.
> 
> Did you test this with the errors actually triggering? Devlink should store only
> one dump, are the counters not going to get out of sync unless something
> clears the dump every time it triggers?

Yes, this was tested by injecting errors through the devlink health test interface.
The dump is generated automatically, and the counters do get out of sync
when errors occur continuously.
That should not be much of an issue, since the user can manually clear the dump
and re-dump to get the exact state of the counters at any point in time.
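
The behaviour discussed above (a single stored dump that goes stale while the SW counters keep incrementing, followed by a manual clear and re-dump) can be illustrated with a toy model. This is only a sketch, not the driver or devlink core code: the class `HealthReporter` and its methods `report_error`, `dump_show` and `dump_clear` are hypothetical names, and the model assumes that a dump is captured only when none is currently stored.

```python
class HealthReporter:
    """Toy model of a reporter that keeps at most one stored dump (assumption)."""

    def __init__(self):
        self.counters = {}       # live per-event counters maintained in SW
        self.stored_dump = None  # at most one snapshot is kept

    def report_error(self, event):
        # Every error bumps the live counter.
        self.counters[event] = self.counters.get(event, 0) + 1
        # Auto-dump: a snapshot is taken only if no dump is stored yet.
        if self.stored_dump is None:
            self.stored_dump = dict(self.counters)

    def dump_show(self):
        # Show the stored snapshot; capture a new one if none exists.
        if self.stored_dump is None:
            self.stored_dump = dict(self.counters)
        return dict(self.stored_dump)

    def dump_clear(self):
        # Drop the stored snapshot so the next show re-reads the counters.
        self.stored_dump = None


reporter = HealthReporter()
for _ in range(3):
    reporter.report_error("AQ Doorbell Error")

stale = reporter.dump_show()   # snapshot from the first error: count is 1
reporter.dump_clear()
fresh = reporter.dump_show()   # new snapshot after clear: count is 3
```

In this model the first `dump_show` lags behind the live counters under continuous errors, and a `dump_clear` followed by another `dump_show` brings the output back in sync, which mirrors the manual workaround described in the reply.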

Regards,
-George
