Message-ID: <CAA85sZt=8_sm+x8Tjbz=G7kxZViK_ZcRt4Ywp2WZn-t6xcVcrw@mail.gmail.com>
Date:   Wed, 17 Apr 2019 15:20:48 +0200
From:   Ian Kumlien <ian.kumlien@...il.com>
To:     Sudarsana Reddy Kalluru <skalluru@...vell.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Ariel Elior <aelior@...vell.com>,
        Ameen Rahman <arahman@...vell.com>
Subject: Re: bnx2x - odd behaviour

On Wed, Apr 17, 2019 at 3:05 PM Sudarsana Reddy Kalluru
<skalluru@...vell.com> wrote:
>
> > -----Original Message-----
> > From: Ian Kumlien <ian.kumlien@...il.com>
> > Sent: Wednesday, April 17, 2019 4:32 PM
> > To: Sudarsana Reddy Kalluru <skalluru@...vell.com>
> > Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>; Ariel Elior
> > <aelior@...vell.com>; Ameen Rahman <arahman@...vell.com>
> > Subject: Re: bnx2x - odd behaviour
> >
> > On Wed, Apr 17, 2019 at 9:58 AM Sudarsana Reddy Kalluru
> > <skalluru@...vell.com> wrote:
> > >
> > > +Ameen
> > >
> > > Ian,
> > > >     We couldn't find the root-cause from the logs/register-dump.
> > > > Could you please load the driver with link-debugs enabled, i.e.,
> > > > modprobe bnx2x debug=0x4 or 'ethtool -s <interface> msglvl 0x4'. And
> > > > collect the complete kernel logs and the register-dump (collected
> > > > before performing ifconfig-down). Please also provide the output of
> > > > "ethtool -i <interface>".
> >
> > I'll try, this is a production system...
> >
> > Could it be related to the GRO changes for UDP that were done in 5.x?
> >
> Thanks for your help. I'm not sure whether this is related to GRO; the
> link-related code is handled by a different component [management
> firmware (mfw)]. Maybe the complete logs/register-dump will provide some
> additional pointers. There were some fixes in the newer versions of the
> mfw, so getting the mfw version on the chip would help
> (ethtool -i <interface> provides the mfw/boot-code version).
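
For reference, the collection steps requested above can be written down as a short sketch. It only builds the command lines and does not run them; the function name and the ordering (enable link debug, dump registers while the problem is present, then query driver info) are my assumptions, while the ethtool invocations themselves are the ones quoted in this thread.

```python
def diagnostic_commands(interface):
    """Build (but do not execute) the ethtool commands requested in the thread.

    The function name and ordering are illustrative assumptions.
    """
    return [
        # Enable link-level debug messages on a running interface
        ["ethtool", "-s", interface, "msglvl", "0x4"],
        # Register dump, to be taken before any ifconfig down / link flap
        ["ethtool", "-d", interface],
        # Driver and firmware (mfw/boot-code) version information
        ["ethtool", "-i", interface],
    ]


if __name__ == "__main__":
    for command in diagnostic_commands("enp2s0f0"):
        print(" ".join(command))
```

The commands need root and, on a production box, the register dump has to be taken while the issue is still present, so printing them first and running them by hand seems the safer choice.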

ethtool -i enp2s0f0
driver: bnx2x
version: 1.712.30-0 storm 7.13.1.0
firmware-version: bc 6.2.28 phy baa0.105
expansion-rom-version:
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
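
If the versions need to be compared across several machines, the output above can be parsed with a small sketch like the following. The helper names are hypothetical, and reading "bc" as the boot-code version is my assumption based on the request above; the sample text is copied from the output in this mail.

```python
def parse_ethtool_info(text):
    """Turn the 'key: value' lines printed by `ethtool -i` into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so bus-info keeps its PCI address
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def split_firmware_version(value):
    """Split 'bc 6.2.28 phy baa0.105' into name/version pairs."""
    tokens = value.split()
    return dict(zip(tokens[0::2], tokens[1::2]))


SAMPLE = """driver: bnx2x
version: 1.712.30-0 storm 7.13.1.0
firmware-version: bc 6.2.28 phy baa0.105
expansion-rom-version:
bus-info: 0000:02:00.0
"""

info = parse_ethtool_info(SAMPLE)
firmware = split_firmware_version(info["firmware-version"])
print(info["driver"], firmware)
```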

What we can see in the logs (not with the linkdebug enabled) is:
apr 12 06:22:35 localhost kernel: bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Down
apr 12 06:22:35 localhost kernel: bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
apr 12 06:22:35 localhost kernel: bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - transmit
apr 12 06:22:35 localhost kernel: bond0: link status up again after 400 ms for interface enp2s0f0
apr 12 06:22:36 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:36 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:37 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1)
apr 12 06:22:37 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:37 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:38 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (2)
apr 12 06:22:38 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:38 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:39 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (3)
apr 12 06:22:39 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:39 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:40 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (4)
apr 12 06:22:40 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:40 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:41 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (5)
apr 12 06:22:41 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:41 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:42 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (6)
apr 12 06:22:42 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:42 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:43 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:43 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:44 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (7)
... and so it begins =)
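
To get a quick overview of how often these messages repeat, a rough sketch like this one can summarize a log excerpt. The function name and the regular expressions are my own, derived only from the message text shown above; it collapses whitespace first so that it also copes with log lines wrapped by a mail client.

```python
import re
from collections import Counter

# Patterns taken from the message text above; not from the driver source
GRC_RE = re.compile(r"GRC time-out (0x[0-9a-fA-F]+)")
NIG_RE = re.compile(r"NIG timer max \((\d+)\)")


def summarize(log_text):
    """Count GRC time-outs per value and return the highest NIG timer count."""
    # Collapse all whitespace so wrapped log lines are rejoined
    flat = " ".join(log_text.split())
    grc_counts = Counter(GRC_RE.findall(flat))
    nig_values = [int(n) for n in NIG_RE.findall(flat)]
    return grc_counts, (max(nig_values) if nig_values else None)


SAMPLE = """apr 12 06:22:36 localhost kernel: bnx2x:
[bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:37 localhost kernel: bnx2x:
[bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1)
apr 12 06:22:37 localhost kernel: bnx2x:
[bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:38 localhost kernel: bnx2x:
[bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (2)
"""

grc_counts, nig_max = summarize(SAMPLE)
print(dict(grc_counts), nig_max)
```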

> > > Thanks,
> > > Sudarsana
> > > > -----Original Message-----
> > > > From: Ian Kumlien <ian.kumlien@...il.com>
> > > > Sent: Friday, April 12, 2019 4:39 PM
> > > > To: Sudarsana Reddy Kalluru <skalluru@...vell.com>
> > > > Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>; Ariel
> > > > Elior <aelior@...vell.com>
> > > > Subject: Re: bnx2x - odd behaviour
> > > >
> > > > On Fri, Apr 12, 2019 at 12:53 PM Sudarsana Reddy Kalluru
> > > > <skalluru@...vell.com> wrote:
> > > > >
> > > > > Hi Ian,
> > > > > >    Thanks for your info/help. There's not much info in the logs
> > > > > > (e.g., FW traces, calltraces). Will contact our firmware team on
> > > > > > the register-dump analysis and provide you the update.
> > > >
> > > > Thank you =)
> > > >
> > > > > Thanks,
> > > > > Sudarsana
> > > > > > -----Original Message-----
> > > > > > From: Ian Kumlien <ian.kumlien@...il.com>
> > > > > > Sent: Friday, April 12, 2019 2:44 PM
> > > > > > To: Sudarsana Reddy Kalluru <skalluru@...vell.com>
> > > > > > Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>;
> > > > > > Ariel Elior <aelior@...vell.com>
> > > > > > Subject: Re: bnx2x - odd behaviour
> > > > > >
> > > > > > Finally!
> > > > > >
> > > > > > Just had a machine with the same issue!
> > > > > >
> > > > > > On Thu, Apr 11, 2019 at 10:56 AM Ian Kumlien
> > > > > > <ian.kumlien@...il.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > On Thu, Apr 4, 2019 at 4:27 PM Sudarsana Reddy Kalluru
> > > > > > > <skalluru@...vell.com> wrote:
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > > >    We are not aware of this issue. Please collect the
> > > > > > > > > register dump, i.e., "ethtool -d <interface>" output when
> > > > > > > > > this issue happens (before performing link-flap) and share
> > > > > > > > > it for the analysis.
> > > > > >
> > > > > > Sent the dump separately :)
