Message-ID: <CACw+751Y=f4ARfdiPYAMaXrE0jCBGkrL-k3+XDoCHEo-6kZxzw@mail.gmail.com>
Date: Sun, 21 Dec 2025 23:35:47 +0100
From: Slackwa Slack <slackwa@...il.com>
To: netdev@...r.kernel.org
Cc: linux-usb@...r.kernel.org
Subject: [BUG] RTL8157 (0bda:8157): firmware lockup with TX SG/GSO using
Realtek r8152 out-of-tree; in-tree r8152 on 6.17.9-zen1 does not bind
Hello,
I would like to report a reproducible firmware lockup issue affecting
Realtek RTL8157 (USB 5GbE) devices using the r8152 driver.
The issue is triggered by TX scatter-gather / GSO traffic, typically
during sustained upload workloads.

Hardware
- USB Ethernet adapter: Realtek RTL8157
- USB ID: 0bda:8157
- Host: AMD X399 platform
- USB controller: xHCI (USB 3.1 Gen2, 10 Gbps)
- Cable / link: USB 10Gbps negotiated correctly

Software
- Kernel: 6.17.9-zen1 (also observed on other 6.x kernels)
- Driver: r8152 (Realtek version v2.21.4/v2.19.2)
- Distribution: Slackware-based (no vendor driver)

Problem description
Under sustained TX load (e.g. iperf3 upload, large TCP streams), when
scatter-gather / GSO is enabled, the RTL8157 firmware eventually
becomes unresponsive.

Once triggered:
- TX stalls and NETDEV WATCHDOG fires
- USB control transfers start timing out (-ETIMEDOUT)
- The kernel attempts to reset the USB device
- During reset, OCP register accesses fail, triggering WARNs

At that point the device is unrecoverable without a full USB reset.

Kernel trace (excerpt)

  WARNING: CPU: 7 PID: 5220 at r8152.c:1393 ocp_word_w0w1+0xe6/0x100 [r8152]
  Call Trace:
    rtl_disable
    rtl8153_disable
    rtl8152_pre_reset
    usb_reset_device

Followed by repeated USB control failures:

  r8152 ... read type=0x0100, index=0xe84c fail -110

This indicates the firmware is already wedged when ocp_word_w0w1() is
called during reset handling.
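
To make the failure line above easier to read when scanning a long dmesg, here is a minimal sketch in C that extracts its fields and names the error code. The struct and function names are hypothetical and not part of the r8152 driver, and the format string is inferred only from the single excerpt quoted above, so other driver versions may print the line differently. On x86 and other architectures that use the generic errno table, -110 is -ETIMEDOUT.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields printed in the failure line. */
struct sketch_ctrl_fail {
    unsigned int type;   /* OCP space selector, e.g. 0x0100 */
    unsigned int index;  /* register offset, e.g. 0xe84c */
    int err;             /* negative errno, e.g. -110 */
};

/* Parse "... type=0x0100, index=0xe84c fail -110".
 * Returns 0 on success, -1 if the line does not match. */
static int sketch_parse_fail(const char *line, struct sketch_ctrl_fail *out)
{
    const char *p = strstr(line, "type=");

    if (p == NULL)
        return -1;
    /* %x accepts an optional 0x prefix, so the hex fields parse directly. */
    if (sscanf(p, "type=%x, index=%x fail %d",
               &out->type, &out->index, &out->err) != 3)
        return -1;
    return 0;
}

/* Name the few error codes of interest here; anything else is "other". */
static const char *sketch_errname(int err)
{
    if (err == -ETIMEDOUT)
        return "ETIMEDOUT";
    if (err == -ENODEV)
        return "ENODEV";
    return "other";
}
```

A run of such lines that all end in ETIMEDOUT matches the picture described here: the control endpoint is still addressed, but the device no longer answers.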

Analysis
All PHY/MAC management on RTL8157 goes through OCP (On-Chip
Peripheral) access, implemented via USB control transfers.

When SG/GSO TX is enabled, the firmware appears to enter a deadlock state:
- TX descriptors stop progressing
- OCP engine stops responding
- All further control accesses time out

The WARN in ocp_word_w0w1() is therefore a symptom, not the root cause.

The issue does not appear to be related to:
- xHCI
- USB bus negotiation
- host memory constraints

It is reproducible only when TX segmentation / scatter-gather is enabled.

Workaround / mitigation
The issue is fully mitigated by:
- Disabling SG / FRAGLIST for RTL8157
- Keeping TSO enabled but with linear buffers only

In practice:
- NETIF_F_SG and NETIF_F_FRAGLIST disabled
- tp->sg_use = false forced for RTL8157
- TSO size limited (RTL_LIMITED_TSO_SIZE)

With this configuration:
- No firmware lockups observed
- No OCP timeouts
- Device remains stable under sustained upload
- Slight TX performance reduction is acceptable
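
The three settings listed under "In practice" can be sketched as a standalone C model. This is not the actual patch: the feature bits, the chip enum, the struct, and the function name below are hypothetical stand-ins so that the logic compiles outside the kernel. The real driver would operate on NETIF_F_SG, NETIF_F_FRAGLIST, tp->sg_use, and RTL_LIMITED_TSO_SIZE in its own data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's NETIF_F_* feature bits.
 * The real definitions live in the kernel headers and differ from these. */
#define SKETCH_F_SG       (UINT64_C(1) << 0)
#define SKETCH_F_FRAGLIST (UINT64_C(1) << 1)
#define SKETCH_F_TSO      (UINT64_C(1) << 2)

/* Hypothetical chip identifiers, not the driver's real version enum. */
enum sketch_chip { SKETCH_CHIP_OTHER, SKETCH_CHIP_RTL8157 };

/* Hypothetical subset of per-device state touched by the quirk. */
struct sketch_priv {
    enum sketch_chip chip;
    uint64_t features;      /* advertised offload features */
    bool sg_use;            /* models tp->sg_use from the report */
    uint32_t tso_max_size;  /* largest TSO frame handed to the device */
};

/* Apply the mitigation only to RTL8157: drop SG and FRAGLIST, force
 * linear TX buffers, and cap the TSO size. TSO itself is left set. */
static void sketch_rtl8157_quirk(struct sketch_priv *tp,
                                 uint32_t limited_tso_size)
{
    if (tp->chip != SKETCH_CHIP_RTL8157)
        return;

    tp->features &= ~(SKETCH_F_SG | SKETCH_F_FRAGLIST);
    tp->sg_use = false;
    if (tp->tso_max_size > limited_tso_size)
        tp->tso_max_size = limited_tso_size;
}
```

Keying the quirk on the chip identifier keeps other adapters handled by the same driver on their existing SG path, so only RTL8157 pays the throughput cost.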

Conclusion
This appears to be a firmware bug specific to RTL8157, triggered by
scatter-gather TX paths.

A driver-side quirk disabling SG for RTL8157 would prevent firmware
lockups and avoid repeated USB resets and WARNs.

I am happy to provide:
- additional logs
- a minimal patch implementing the quirk
- further testing if needed

Thank you for your time.

Best regards,