Message-ID: <aa06a22e-7056-4ac9-8830-fd05c85250e5@gmail.com>
Date: Sun, 26 Nov 2023 11:35:22 +0100
From: Heiner Kallweit <hkallweit1@...il.com>
To: Gregor Mlakar <turok256@...il.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Linux kernel 6.6.2: Dragon RTL8125BG network card stopped working
On 26.11.2023 02:46, Gregor Mlakar wrote:
> Hello,
>
> The network card (Dragon RTL8125BG) on my motherboard (B650E Steel Legend WiFi) has stopped working on Arch Linux with Linux kernel 6.6.2 (both the normal and zen kernels). If I revert to kernel 6.6.1, it works fine. When I try to reboot, the PC gets stuck at a line saying "watchdog did not stop!".
>
> Motherboard:
> https://www.asrock.com/mb/AMD/B650E%20Steel%20Legend%20WiFi/index.asp#Specification
>
> dmesg (the last part with call trace keeps repeating every 122s):
>
> [ 7.612105] r8169 0000:09:00.0 eth0: RTL8125B, xx:xx:xx:xx:xx:xx, XID 641, IRQ 116
> [ 7.612109] r8169 0000:09:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
> [ 7.659150] r8169 0000:09:00.0 enp9s0: renamed from eth0
> [ 7.708638] cryptd: max_cpu_qlen set to 1000
> [ 7.726830] Bluetooth: Core ver 2.22
> [ 7.726844] NET: Registered PF_BLUETOOTH protocol family
> [ 7.726846] Bluetooth: HCI device and connection manager initialized
> [ 7.726848] Bluetooth: HCI socket layer initialized
> [ 7.726850] Bluetooth: L2CAP socket layer initialized
> [ 7.726853] Bluetooth: SCO socket layer initialized
> [ 7.726939] mc: Linux media interface: v0.10
> [ 7.730916] AVX2 version of gcm_enc/dec engaged.
> [ 7.730959] AES CTR mode by8 optimization enabled
> [ 7.741154] usbcore: registered new interface driver btusb
> [ 7.752863] Bluetooth: hci0: HW/SW Version: 0x008a008a, Build Time: xxxxxxxxxxxxxx
> [ 7.829804] kvm_amd: TSC scaling supported
> [ 7.829806] kvm_amd: Nested Virtualization enabled
> [ 7.829807] kvm_amd: Nested Paging enabled
> [ 7.829813] kvm_amd: Virtual VMLOAD VMSAVE supported
> [ 7.829813] kvm_amd: Virtual GIF supported
> [ 7.829814] kvm_amd: Virtual NMI enabled
> [ 7.829814] kvm_amd: LBR virtualization supported
> [ 7.837383] MCE: In-kernel MCE decoding enabled.
> [ 7.925523] intel_rapl_common: Found RAPL domain package
> [ 7.925525] intel_rapl_common: Found RAPL domain core
> [ 8.164594] usbcore: registered new interface driver snd-usb-audio
> [ 8.274455] cfg80211: Loading compiled-in X.509 certificates for regulatory database
> [ 8.274596] Loaded X.509 cert 'sforshee: xxxxxxxxxxxxxxxxxx'
> [ 8.274694] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
> [ 8.274697] cfg80211: failed to load regulatory.db
> [ 8.310577] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-900:00: attached PHY driver (mii_bus:phy_addr=r8169-0-900:00, irq=MAC)
> [ 29.331343] Bluetooth: hci0: Device setup in 21084167 usecs
> [ 29.331347] Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported.
> [ 29.604845] Bluetooth: hci0: AOSP extensions version v1.00
> [ 29.604847] Bluetooth: hci0: AOSP quality report is supported
> [ 198.084608] firefox[969]: memfd_create() called without MFD_EXEC or MFD_NOEXEC_SEAL set
> [ 245.487028] INFO: task kworker/u66:4:261 blocked for more than 122 seconds.
> [ 245.487033] Not tainted 6.6.2-arch1-1 #1
> [ 245.487034] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 245.487035] task:kworker/u66:4 state:D stack:0 pid:261 ppid:2 flags:0x00004000
> [ 245.487039] Workqueue: events_power_efficient phy_state_machine [libphy]
> [ 245.487051] Call Trace:
> [ 245.487052] <TASK>
> [ 245.487054] __schedule+0x3e8/0x1410
> [ 245.487058] ? sysvec_apic_timer_interrupt+0xe/0x90
> [ 245.487063] schedule+0x5e/0xd0
> [ 245.487065] schedule_preempt_disabled+0x15/0x30
> [ 245.487067] __mutex_lock.constprop.0+0x39a/0x6a0
> [ 245.487071] phy_start_aneg+0x1d/0x40 [libphy 93248cd1d88abf54f1b4cc64a990177f549a7710]
> [ 245.487081] rtl_reset_work+0x1bd/0x3b0 [r8169 08653ab60f23923c3943d53f140b2b697e265b93]
> [ 245.487087] r8169_phylink_handler+0x5b/0x240 [r8169 08653ab60f23923c3943d53f140b2b697e265b93]
> [ 245.487091] phy_link_change+0x2e/0x60 [libphy 93248cd1d88abf54f1b4cc64a990177f549a7710]
> [ 245.487101] phy_check_link_status+0xad/0xe0 [libphy 93248cd1d88abf54f1b4cc64a990177f549a7710]
> [ 245.487110] phy_state_machine+0x80/0x2c0 [libphy 93248cd1d88abf54f1b4cc64a990177f549a7710]
> [ 245.487119] process_one_work+0x171/0x340
> [ 245.487123] worker_thread+0x27b/0x3a0
> [ 245.487125] ? __pfx_worker_thread+0x10/0x10
> [ 245.487126] kthread+0xe5/0x120
> [ 245.487129] ? __pfx_kthread+0x10/0x10
> [ 245.487131] ret_from_fork+0x31/0x50
> [ 245.487134] ? __pfx_kthread+0x10/0x10
> [ 245.487135] ret_from_fork_asm+0x1b/0x30
> [ 245.487141] </TASK>
>
>
> lspci:
>
> 09:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
> Subsystem: ASRock Incorporation RTL8125 2.5GbE Controller
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 40
> IOMMU group: 1
> Region 0: I/O ports at e000 [size=256]
> Region 2: Memory at fca00000 (64-bit, non-prefetchable) [size=64K]
> Region 4: Memory at fca10000 (64-bit, non-prefetchable) [size=16K]
> Capabilities: [40] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
> Address: 0000000000000000 Data: 0000
> Masking: 00000000 Pending: 00000000
> Capabilities: [70] Express (v2) Endpoint, MSI 01
> DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
> ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 26W
> DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
> RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> MaxPayload 256 bytes, MaxReadReq 4096 bytes
> DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
> LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
> ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
> LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> LnkSta: Speed 5GT/s, Width x1
> TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
> 10BitTagComp- 10BitTagReq- OBFF Via message/WAKE#, ExtFmt- EETLPPrefix-
> EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
> FRS- TPHComp+ ExtTPHComp-
> AtomicOpsCap: 32bit- 64bit- 128bitCAS-
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ 10BitTagReq- OBFF Disabled,
> AtomicOpsCtl: ReqEn-
> LnkCap2: Supported Link Speeds: 2.5-5GT/s, Crosslink- Retimer- 2Retimers- DRS-
> LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
> Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
> LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
> EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
> Retimer- 2Retimers- CrosslinkRes: unsupported
> Capabilities: [b0] MSI-X: Enable+ Count=32 Masked-
> Vector table: BAR=4 offset=00000000
> PBA: BAR=4 offset=00000800
> Capabilities: [d0] Vital Product Data
> pcilib: sysfs_read_vpd: read failed: No such device
> Not readable
> Capabilities: [100 v2] Advanced Error Reporting
> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
> CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
> AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
> MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
> HeaderLog: 00000000 00000000 00000000 00000000
> Capabilities: [148 v1] Virtual Channel
> Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
> Arb: Fixed- WRR32- WRR64- WRR128-
> Ctrl: ArbSelect=Fixed
> Status: InProgress-
> VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
> Status: NegoPending- InProgress-
> Capabilities: [168 v1] Device Serial Number xx-xx-xx-xx-xx-xx-xx-xx
> Capabilities: [178 v1] Transaction Processing Hints
> No steering table available
> Capabilities: [204 v1] Latency Tolerance Reporting
> Max snoop latency: 0ns
> Max no snoop latency: 0ns
> Capabilities: [20c v1] L1 PM Substates
> L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> PortCommonModeRestoreTime=150us PortTPowerOnTime=150us
> L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
> T_CommonMode=0us LTR1.2_Threshold=306176ns
> L1SubCtl2: T_PwrOn=150us
> Capabilities: [21c v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
> Kernel driver in use: r8169
> Kernel modules: r8169
>
>
> Best regards,
> Gregor Mlakar

Thanks for the report. A very similar, possibly identical, issue has already been reported.
Are you using a jumbo MTU?
Could you please test whether the following patch fixes the issue for you?
---
drivers/net/ethernet/realtek/r8169_main.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 0aed99a20..e32cc3279 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -575,6 +575,7 @@ struct rtl8169_tc_offsets {
 enum rtl_flag {
 	RTL_FLAG_TASK_ENABLED = 0,
 	RTL_FLAG_TASK_RESET_PENDING,
+	RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE,
 	RTL_FLAG_TASK_TX_TIMEOUT,
 	RTL_FLAG_MAX
 };
@@ -4494,6 +4495,8 @@ static void rtl_task(struct work_struct *work)
 reset:
 		rtl_reset_work(tp);
 		netif_wake_queue(tp->dev);
+	} else if (test_and_clear_bit(RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE, tp->wk.flags)) {
+		rtl_reset_work(tp);
 	}
 out_unlock:
 	rtnl_unlock();
@@ -4527,7 +4530,7 @@ static void r8169_phylink_handler(struct net_device *ndev)
 	} else {
 		/* In few cases rx is broken after link-down otherwise */
 		if (rtl_is_8125(tp))
-			rtl_reset_work(tp);
+			rtl_schedule_task(tp, RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE);
 		pm_runtime_idle(d);
 	}
@@ -4603,7 +4606,7 @@ static int rtl8169_close(struct net_device *dev)
 	rtl8169_down(tp);
 	rtl8169_rx_clear(tp);
 
-	cancel_work_sync(&tp->wk.work);
+	cancel_work(&tp->wk.work);
 
 	free_irq(tp->irq, tp);
--
2.43.0
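
For readers following the thread, the hang in the call trace can be modelled in a few lines of userspace C. This is only a sketch, and it rests on assumptions the mail does not state outright: that phy_state_machine() invokes the link-change handler while holding the PHY lock, and that phy_start_aneg(), reached via rtl_reset_work(), then tries to take that same lock. All toy_* names are hypothetical and are not kernel APIs. The toy lock returns -EDEADLK where a real kernel mutex would simply block forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock; reports a self-deadlock instead of hanging. */
struct toy_lock {
	bool held;
};

/* Hypothetical stand-in for the PHY/driver state touched in the patch. */
struct toy_phy {
	struct toy_lock lock;
	bool reset_pending;	/* plays the role of RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE */
	int resets_done;
};

static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return -EDEADLK;	/* a real mutex would sleep here forever */
	l->held = true;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Reset path: needs the lock itself, like phy_start_aneg() in the trace. */
static int toy_reset(struct toy_phy *phy)
{
	int ret = toy_lock_acquire(&phy->lock);

	if (ret)
		return ret;
	phy->resets_done++;
	toy_lock_release(&phy->lock);
	return 0;
}

/* Unpatched behaviour (assumed): reset directly while the lock is held. */
static int toy_link_down_direct(struct toy_phy *phy)
{
	int ret = toy_lock_acquire(&phy->lock);

	if (ret)
		return ret;
	ret = toy_reset(phy);		/* second acquire by the same caller */
	toy_lock_release(&phy->lock);
	return ret;
}

/* Patched behaviour: only record that a reset is wanted. */
static int toy_link_down_deferred(struct toy_phy *phy)
{
	int ret = toy_lock_acquire(&phy->lock);

	if (ret)
		return ret;
	phy->reset_pending = true;
	toy_lock_release(&phy->lock);
	return 0;
}

/* Worker that runs later, outside the lock, and performs the reset. */
static int toy_task(struct toy_phy *phy)
{
	if (!phy->reset_pending)
		return 0;
	phy->reset_pending = false;
	return toy_reset(phy);
}
```

On a zero-initialised struct toy_phy, toy_link_down_direct() fails with -EDEADLK and performs no reset, whereas toy_link_down_deferred() followed by toy_task() completes exactly one reset. That is, in spirit, what replacing the direct rtl_reset_work() call with rtl_schedule_task() is meant to achieve.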