Date: Mon, 7 Feb 2022 19:04:23 +0100
From: Thomas Kupper <thomas@...per.org>
To: Shyam Sundar S K <Shyam-sundar.S-k@....com>, Tom Lendacky <thomas.lendacky@....com>
Cc: netdev@...r.kernel.org
Subject: Re: AMD XGBE "phy irq request failed" kernel v5.17-rc2 on V1500B based board

On 07.02.22 at 16:19, Shyam Sundar S K wrote:
>
> On 2/7/2022 8:02 PM, Tom Lendacky wrote:
>> On 2/5/22 12:14, Thomas Kupper wrote:
>>> On 05.02.22 at 16:51, Tom Lendacky wrote:
>>>> On 2/5/22 04:06, Thomas Kupper wrote:
>>>> Reloading the module and specify the dyndbg option to get some
>>>> additional debug output.
>>>>
>>>> I'm adding Shyam to the thread, too, as I'm not familiar with the
>>>> configuration for this chip.
>>>>
>>> Right after boot:
>>>
>>> [ 5.352977] amd-xgbe 0000:06:00.1 eth0: net device enabled
>>> [ 5.354198] amd-xgbe 0000:06:00.2 eth1: net device enabled
>>> ...
>>> [ 5.382185] amd-xgbe 0000:06:00.1 enp6s0f1: renamed from eth0
>>> [ 5.426931] amd-xgbe 0000:06:00.2 enp6s0f2: renamed from eth1
>>> ...
>>> [ 9.701637] amd-xgbe 0000:06:00.2 enp6s0f2: phy powered off
>>> [ 9.701679] amd-xgbe 0000:06:00.2 enp6s0f2: CL73 AN disabled
>>> [ 9.701715] amd-xgbe 0000:06:00.2 enp6s0f2: CL37 AN disabled
>>> [ 9.738191] amd-xgbe 0000:06:00.2 enp6s0f2: starting PHY
>>> [ 9.738219] amd-xgbe 0000:06:00.2 enp6s0f2: starting I2C
>>> ...
>>> [ 10.742622] amd-xgbe 0000:06:00.2 enp6s0f2: firmware mailbox
>>> command did not complete
>>> [ 10.742710] amd-xgbe 0000:06:00.2 enp6s0f2: firmware mailbox reset
>>> performed
>>> [ 10.750813] amd-xgbe 0000:06:00.2 enp6s0f2: 10GbE SFI mode set
>>> [ 10.768366] amd-xgbe 0000:06:00.2 enp6s0f2: 10GbE SFI mode set
>>> [ 10.768371] amd-xgbe 0000:06:00.2 enp6s0f2: fixed PHY configuration
>>>
>>> Then after 'ifconfig enp6s0f2 up':
>>>
>>> [ 189.184928] amd-xgbe 0000:06:00.2 enp6s0f2: phy powered off
>>> [ 189.191828] amd-xgbe 0000:06:00.2 enp6s0f2: 10GbE SFI mode set
>>> [ 189.191863] amd-xgbe 0000:06:00.2 enp6s0f2: CL73 AN disabled
>>> [ 189.191894] amd-xgbe 0000:06:00.2 enp6s0f2: CL37 AN disabled
>>> [ 189.196338] amd-xgbe 0000:06:00.2 enp6s0f2: starting PHY
>>> [ 189.198792] amd-xgbe 0000:06:00.2 enp6s0f2: 10GbE SFI mode set
>>> [ 189.212036] genirq: Flags mismatch irq 69. 00000000 (enp6s0f2-pcs)
>>> vs. 00000000 (enp6s0f2-pcs)
>>> [ 189.221700] amd-xgbe 0000:06:00.2 enp6s0f2: phy irq request failed
>>> [ 189.231051] amd-xgbe 0000:06:00.2 enp6s0f2: phy powered off
>>> [ 189.231054] amd-xgbe 0000:06:00.2 enp6s0f2: stopping I2C
>>>
>> Please ensure that the ethtool msglvl is on for drv and probe. I was
>> expecting to see some additional debug messages that I don't see here.
>>
>> Also, if you can provide the lspci output for the device (using -nn and
>> -vv) that might be helpful as well.
>>
>> Shyam will be the best one to understand what is going on here.
> On some other platforms, we have seen similar kind of problems getting
> reported. There is a fix sent for validation.
>
> The root cause is that removal of xgbe driver is causing interrupt storm
> on the MP2 device (Sensor Fusion Hub).
>
> Shall submit a fix soon to upstream once the validation is done, you may
> give it a try with that and see if that helps.
>
> Thanks,
> Shyam
>
>> Thanks,
>> Tom

Sorry, forgot the 'lspci -nn -vv' output. Here it goes:

$ ethtool -i enp6s0f2
driver: amd-xgbe
version: 5.17.0-rc2-tk
firmware-version: 17.118.33
expansion-rom-version:
bus-info: 0000:06:00.2
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

$ lspci -nn -vv -s 0:6:0.2
06:00.2 Ethernet controller [0200]: Advanced Micro Devices, Inc. [AMD] Device [1022:1458]
	Subsystem: Advanced Micro Devices, Inc. [AMD] Device [1022:1458]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin C routed to IRQ 69
	Region 0: Memory at d0020000 (32-bit, non-prefetchable) [size=128K]
	Region 1: Memory at d0000000 (32-bit, non-prefetchable) [size=128K]
	Region 2: Memory at d0080000 (64-bit, non-prefetchable) [size=8K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [64] Express (v2) Endpoint, MSI 00
		DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
		DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
		LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta: Speed 8GT/s (ok), Width x16 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR-
			10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
			EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			FRS- TPHComp- ExtTPHComp-
			AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
			AtomicOpsCtl: ReqEn-
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete- EqualizationPhase1-
			EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
			Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [a0] MSI: Enable- Count=1/8 Maskable- 64bit+
		Address: 0000000000000000 Data: 0000
	Capabilities: [c0] MSI-X: Enable+ Count=7 Masked-
		Vector table: BAR=2 offset=00000000
		PBA: BAR=2 offset=00001000
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v2] Advanced Error Reporting
		UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [2a0 v1] Access Control Services
		ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
	Kernel driver in use: amd-xgbe
	Kernel modules: amd_xgbe

/Thomas
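
For readers triaging similar logs, the "genirq: Flags mismatch" line in the quoted dmesg output can be pulled apart mechanically. The following is only a rough sketch, not anything posted in this thread: the function name parse_flags_mismatch and the returned keys are made up for illustration, the line is assumed to have been re-joined after the mail wrap, and the reading of the two fields as "new request vs. existing handler" and the IRQF_SHARED value 0x80 are assumptions taken from the kernel's include/linux/interrupt.h and genirq code rather than from the messages above.

```python
import re

# Pattern for the genirq warning as it appears in the quoted dmesg output,
# after re-joining the line that the mail client wrapped.
GENIRQ_RE = re.compile(
    r"genirq: Flags mismatch irq (?P<irq>\d+)\. "
    r"(?P<new_flags>[0-9a-fA-F]{8}) \((?P<new_name>[^)]+)\) vs\. "
    r"(?P<old_flags>[0-9a-fA-F]{8}) \((?P<old_name>[^)]+)\)"
)

# Assumption: IRQF_SHARED bit value as defined in include/linux/interrupt.h.
IRQF_SHARED = 0x80


def parse_flags_mismatch(line):
    """Return the fields of a 'Flags mismatch' line, or None if it does not match.

    Hypothetical helper for illustration only.
    """
    m = GENIRQ_RE.search(line)
    if m is None:
        return None
    new_flags = int(m.group("new_flags"), 16)
    old_flags = int(m.group("old_flags"), 16)
    return {
        "irq": int(m.group("irq")),
        "new_name": m.group("new_name"),
        "old_name": m.group("old_name"),
        # True when the same handler name appears on both sides.
        "same_name": m.group("new_name") == m.group("old_name"),
        # True only if both sides carry the shared-interrupt bit.
        "both_shared": bool(new_flags & old_flags & IRQF_SHARED),
    }


sample = (
    "[ 189.212036] genirq: Flags mismatch irq 69. 00000000 (enp6s0f2-pcs) "
    "vs. 00000000 (enp6s0f2-pcs)"
)
info = parse_flags_mismatch(sample)
print(info)
```

On the line from this report, both sides name enp6s0f2-pcs and neither has the shared bit set. Under the assumptions above, that would be consistent with IRQ 69 still being held by an earlier request from the same interface when the PHY interrupt was requested again, but the sketch does not establish the cause.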