Date:   Thu, 25 Apr 2019 06:20:05 +0000
From:   Sudarsana Reddy Kalluru <skalluru@...vell.com>
To:     Ian Kumlien <ian.kumlien@...il.com>
CC:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Ariel Elior <aelior@...vell.com>,
        Ameen Rahman <arahman@...vell.com>
Subject: RE: bnx2x - odd behaviour

Hi Ian,
    Thanks for the info. The BCM57711E is an older version of the chip. Could you please reproduce the issue with elink debugs enabled (modprobe bnx2x debug=0x4) and provide the complete logs and the register dump?

Thanks,
Sudarsana
> -----Original Message-----
> From: Ian Kumlien <ian.kumlien@...il.com>
> Sent: Wednesday, April 24, 2019 8:20 PM
> To: Sudarsana Reddy Kalluru <skalluru@...vell.com>
> Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>; Ariel Elior
> <aelior@...vell.com>; Ameen Rahman <arahman@...vell.com>
> Subject: Re: bnx2x - odd behaviour
> 
> On Fri, Apr 19, 2019 at 7:23 AM Sudarsana Reddy Kalluru
> <skalluru@...vell.com> wrote:
> >
> > Hi Ian,
> >     Thanks for your info. The mfw team already analyzed the "nig timer" related
> > logs but can't infer anything. From the boot-code version, the device looks to
> > be from the older generation of Broadcom NICs. Besides the
> > elink-logs/register-dump, could you also share the lspci output (lspci -vvv)?
> 
> Yes, this is older machines =)
> 
> Sorry for the delay in answering, there has been a holiday here, =)
> 
> lspci output:
> 02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM57711E 10-Gigabit PCIe
>         Subsystem: Hewlett-Packard Company NC532i Dual Port 10GbE Multifunction BL-C Adapter
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 41
>         Region 0: Memory at fb000000 (64-bit, non-prefetchable) [size=8M]
>         Region 2: Memory at fa800000 (64-bit, non-prefetchable) [size=8M]
>         Capabilities: [48] Power Management version 3
>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=1 PME-
>         Capabilities: [50] Vital Product Data
>                 Product Name: HP NC532i DP 10GbE Multifunction BL-c Adapter
>                 Read-only fields:
>                         [PN] Part number: N/A
>                         [EC] Engineering changes: N/A
>                         [SN] Serial number: 0123456789
>                         [MN] Manufacture ID: 31 34 65 34
>                         [RV] Reserved: checksum good, 39 byte(s) reserved
>                 End
>         Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [a0] MSI-X: Enable+ Count=17 Masked-
>                 Vector table: BAR=0 offset=00440000
>                 PBA: BAR=0 offset=00441800
>         Capabilities: [ac] Express (v2) Endpoint, MSI 00
>                 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <2us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr+ NoSnoop+
>                         MaxPayload 256 bytes, MaxReadReq 4096 bytes
>                 DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
>                 LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <2us, L1 <2us
>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>                 DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>                          Compliance De-emphasis: -6dB
>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>         Capabilities: [100 v1] Device Serial Number 44-1e-a1-ff-fe-45-a6-38
>         Capabilities: [110 v1] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>                 UESvrt: DLP- SDES+ TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>         Capabilities: [150 v1] Power Budgeting <?>
>         Capabilities: [160 v1] Virtual Channel
>                 Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
>                 Arb:    Fixed- WRR32- WRR64- WRR128-
>                 Ctrl:   ArbSelect=Fixed
>                 Status: InProgress-
>                 VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>                         Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>                         Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
>                         Status: NegoPending- InProgress-
>         Kernel driver in use: bnx2x
>         Kernel modules: bnx2x
> ---
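
In the lspci output above, LnkCap reports Width x8 while LnkSta reports Width x4. The thread does not say whether this is relevant to the problem, so treat the following only as a minimal sketch for pulling those two fields out of saved `lspci -vvv` text when comparing machines. The function name `link_summary` and the sample string are illustrative, not part of any tool.

```python
import re


def link_summary(lspci_text):
    """Return ((cap_speed, cap_width), (sta_speed, sta_width)).

    Either element is None if the corresponding line is not found.
    """
    def grab(tag):
        # "LnkCap:" / "LnkSta:" with the colon, so LnkSta2 is not matched.
        m = re.search(tag + r":.*?Speed ([\d.]+GT/s).*?Width (x\d+)", lspci_text)
        return (m.group(1), m.group(2)) if m else None

    return grab("LnkCap"), grab("LnkSta")


# Two lines copied from the output quoted above (quote prefixes removed).
sample = (
    "LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <2us, L1 <2us\n"
    "LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-\n"
)

capable, negotiated = link_summary(sample)
if capable != negotiated:
    print("link capability", capable, "differs from negotiated", negotiated)
```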
> 
> 
> > Thanks,
> > Sudarsana
> > > -----Original Message-----
> > > From: Ian Kumlien <ian.kumlien@...il.com>
> > > Sent: Wednesday, April 17, 2019 6:51 PM
> > > To: Sudarsana Reddy Kalluru <skalluru@...vell.com>
> > > Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>; Ariel
> > > Elior <aelior@...vell.com>; Ameen Rahman <arahman@...vell.com>
> > > Subject: Re: bnx2x - odd behaviour
> > >
> > > On Wed, Apr 17, 2019 at 3:05 PM Sudarsana Reddy Kalluru
> > > <skalluru@...vell.com> wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Ian Kumlien <ian.kumlien@...il.com>
> > > > > Sent: Wednesday, April 17, 2019 4:32 PM
> > > > > To: Sudarsana Reddy Kalluru <skalluru@...vell.com>
> > > > > Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>;
> > > > > Ariel Elior <aelior@...vell.com>; Ameen Rahman
> > > > > <arahman@...vell.com>
> > > > > Subject: Re: bnx2x - odd behaviour
> > > > >
> > > > > On Wed, Apr 17, 2019 at 9:58 AM Sudarsana Reddy Kalluru
> > > > > <skalluru@...vell.com> wrote:
> > > > > >
> > > > > > +Ameen
> > > > > >
> > > > > > Ian,
> > > > > > >     We couldn't find the root-cause from the logs/register-dump.
> > > > > > > Could you please load the driver with link-debugs enabled,
> > > > > > > i.e., modprobe bnx2x debug=0x4 or 'ethtool -s <interface> msglvl 0x4'.
> > > > > > > And collect the complete kernel logs and the register-dump (collected
> > > > > > > before performing ifconfig-down). Please also provide the output
> > > > > > > of "ethtool -i <interface>".
> > > > >
> > > > > I'll try, this is a production system...
> > > > >
> > > > > Could it be related to the gro changes for UDP that was done in 5.x?
> > > > >
> > > > > Thanks for your help. I'm not sure if this is related to gro, link
> > > > > related code is handled by different component [management firmware (mfw)].
> > > > > May be the complete logs/register-dump provide some additional pointers.
> > > > > There were some fixes in the newer version of mfw, getting the mfw
> > > > > version on the chip would help (ethtool -i <interface> provides
> > > > > mfw/boot-code version).
> > >
> > > ethtool -i enp2s0f0
> > > driver: bnx2x
> > > version: 1.712.30-0 storm 7.13.1.0
> > > firmware-version: bc 6.2.28 phy baa0.105
> > > expansion-rom-version:
> > > bus-info: 0000:02:00.0
> > > supports-statistics: yes
> > > supports-test: yes
> > > supports-eeprom-access: yes
> > > supports-register-dump: yes
> > > supports-priv-flags: yes
> > >
> > > What we can see in the logs (not with the linkdebug enabled) is:
> > > > apr 12 06:22:35 localhost kernel: bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Down
> > > > apr 12 06:22:35 localhost kernel: bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > > > apr 12 06:22:35 localhost kernel: bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - transmit
> > > > apr 12 06:22:35 localhost kernel: bond0: link status up again after 400 ms for interface enp2s0f0
> > > > apr 12 06:22:36 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > > > apr 12 06:22:36 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > > > apr 12 06:22:37 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1)
> > > > apr 12 06:22:37 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > > > apr 12 06:22:37 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > > > apr 12 06:22:38 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (2)
> > > > apr 12 06:22:38 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > > > apr 12 06:22:38 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > > > apr 12 06:22:39 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (3)
> > > > apr 12 06:22:39 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > > > apr 12 06:22:39 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > > > apr 12 06:22:40 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (4)
> > > > apr 12 06:22:40 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > > > apr 12 06:22:40 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > > > apr 12 06:22:41 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (5)
> > > > apr 12 06:22:41 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > > > apr 12 06:22:41 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > > > apr 12 06:22:42 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (6)
> > > > apr 12 06:22:42 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > > > apr 12 06:22:42 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > > > apr 12 06:22:43 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > > > apr 12 06:22:43 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > > > apr 12 06:22:44 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (7)
> > > > ... and so it begins =)
> > >
> > > > > > Thanks,
> > > > > > Sudarsana
> > > > > > > -----Original Message-----
> > > > > > > From: Ian Kumlien <ian.kumlien@...il.com>
> > > > > > > Sent: Friday, April 12, 2019 4:39 PM
> > > > > > > To: Sudarsana Reddy Kalluru <skalluru@...vell.com>
> > > > > > > Cc: Linux Kernel Network Developers
> > > > > > > <netdev@...r.kernel.org>; Ariel Elior <aelior@...vell.com>
> > > > > > > Subject: Re: bnx2x - odd behaviour
> > > > > > >
> > > > > > > On Fri, Apr 12, 2019 at 12:53 PM Sudarsana Reddy Kalluru
> > > > > > > <skalluru@...vell.com> wrote:
> > > > > > > >
> > > > > > > > Hi Ian,
> > > > > > > > >    Thanks for your info/help. There's not much info in the
> > > > > > > > > logs (e.g., FW traces, calltraces). Will contact our firmware
> > > > > > > > > team on the register-dump analysis and provide you the update.
> > > > > > >
> > > > > > > Thank you =)
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Sudarsana
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Ian Kumlien <ian.kumlien@...il.com>
> > > > > > > > > Sent: Friday, April 12, 2019 2:44 PM
> > > > > > > > > To: Sudarsana Reddy Kalluru <skalluru@...vell.com>
> > > > > > > > > Cc: Linux Kernel Network Developers
> > > > > > > > > <netdev@...r.kernel.org>; Ariel Elior
> > > > > > > > > <aelior@...vell.com>
> > > > > > > > > Subject: Re: bnx2x - odd behaviour
> > > > > > > > >
> > > > > > > > > Finally!
> > > > > > > > >
> > > > > > > > > Just had a machine with the same issue!
> > > > > > > > >
> > > > > > > > > On Thu, Apr 11, 2019 at 10:56 AM Ian Kumlien
> > > > > > > > > <ian.kumlien@...il.com>
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > On Thu, Apr 4, 2019 at 4:27 PM Sudarsana Reddy Kalluru
> > > > > > > > > > <skalluru@...vell.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > > >    We are not aware of this issue. Please collect
> > > > > > > > > > > > the register dump i.e., "ethtool -d <interface>" output
> > > > > > > > > > > > when this issue happens (before performing link-flap)
> > > > > > > > > > > > and share it for the analysis.
> > > > > > > > >
> > > > > > > > > Sent the dump separately :)
