Message-ID: <aWlCs5lksxfgL6Gi@shell.armlinux.org.uk>
Date: Thu, 15 Jan 2026 19:40:35 +0000
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Tao Wang <tao03.wang@...izon.auto>
Cc: alexandre.torgue@...s.st.com, andrew+netdev@...n.ch,
davem@...emloft.net, edumazet@...gle.com, horms@...nel.org,
kuba@...nel.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, maxime.chevallier@...tlin.com,
mcoquelin.stm32@...il.com, netdev@...r.kernel.org,
pabeni@...hat.com
Subject: Re: [PATCH net v2] net: stmmac: fix transmit queue timed out
after resume
On Thu, Jan 15, 2026 at 12:09:18PM +0000, Russell King (Oracle) wrote:
> On Thu, Jan 15, 2026 at 03:08:53PM +0800, Tao Wang wrote:
> > > While I agree with the change for stmmac_tso_xmit(), please explain why
> > > the change in stmmac_free_tx_buffer() is necessary.
> > >
> > > It seems to me that if this is missing in stmmac_free_tx_buffer(), the
> > > driver should have more problems than just TSO.
> >
> > The change in stmmac_free_tx_buffer() is intended to be generic for all
> > users of last_segment, not only for the TSO path.
>
> However, transmit is a hotpath, so work needs to be minimised for good
> performance. We don't want anything that is unnecessary in these paths.
>
> If we always explicitly set .last_segment when adding any packet to the
> ring, then there is absolutely no need to also do so when freeing them.
>
> Also, I think there's a similar issue with .is_jumbo.
>
> So, I think it would make more sense to have some helpers for setting
> up the tx_skbuff_dma entry. Maybe something like the below? I'll see
> if I can measure the performance impact of this later today, but I
> can't guarantee I'll get to that.
>
> The idea here is to ensure that all members with the exception of
> xsk_meta are fully initialised when an entry is populated.
>
> I haven't removed anything in the tx_q->tx_skbuff_dma entry release
> path yet, but with this in place, we should be able to eliminate the
> clearing of these fields in stmmac_tx_clean() and
> stmmac_free_tx_buffer().
>
> Note that the driver assumes that setting .buf to zero means the
> entry is cleared. dma_addr_t is a device-specific cookie, and zero
> may be a valid DMA cookie. Only DMA_MAPPING_ERROR is guaranteed to
> be invalid; no other value can be assumed to hold any particular
> meaning in driver code. So that needs fixing as well.
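
To make the above concrete, here is a rough sketch of the sort of
helper I mean (name and exact member layout illustrative only, not
the actual patch):

static inline void stmmac_set_tx_buf(struct stmmac_tx_queue *tx_q,
                                     unsigned int entry, dma_addr_t buf,
                                     unsigned int len, bool map_as_page,
                                     enum stmmac_txbuf_type buf_type)
{
        struct stmmac_tx_info *info = &tx_q->tx_skbuff_dma[entry];

        info->buf = buf;
        info->len = len;
        info->map_as_page = map_as_page;
        info->buf_type = buf_type;
        /* Callers mark the genuine last fragment afterwards, so a
         * stale value can't leak through from a previous use of
         * this ring entry.
         */
        info->last_segment = false;
        info->is_jumbo = false;
}

With every entry fully initialised at population time, the "cleared"
marker could also move from .buf == 0 to .buf == DMA_MAPPING_ERROR,
which would address the cookie problem above.
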
I've just run iperf3 in both directions with the kernel I had on the
board (based on 6.18.0-rc7-net-next+), and stmmac really isn't looking
particularly great - by that I mean, iperf3 *failed* spectacularly.
First, running in normal mode (stmmac transmitting, x86 receiving)
it's only capable of 210Mbps, which is nowhere near line rate.
However, when running iperf3 in reverse mode, the stmmac's receive
queue filled up, and the hardware then started spewing PAUSE frames
at a rate of knots, flooding the network and bringing it to a
standstill. It never recovered without rebooting.
Trying again on 6.19.0-rc4-net-next+,
stmmac transmitting shows the same dire performance:
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 24.2 MBytes 203 Mbits/sec 0 230 KBytes
[ 5] 1.00-2.00 sec 25.5 MBytes 214 Mbits/sec 0 230 KBytes
[ 5] 2.00-3.00 sec 25.0 MBytes 210 Mbits/sec 0 230 KBytes
[ 5] 3.00-4.00 sec 25.5 MBytes 214 Mbits/sec 0 230 KBytes
[ 5] 4.00-5.00 sec 25.1 MBytes 211 Mbits/sec 0 230 KBytes
[ 5] 5.00-6.00 sec 25.1 MBytes 211 Mbits/sec 0 230 KBytes
[ 5] 6.00-7.00 sec 25.7 MBytes 215 Mbits/sec 0 230 KBytes
[ 5] 7.00-8.00 sec 25.2 MBytes 212 Mbits/sec 0 230 KBytes
[ 5] 8.00-9.00 sec 25.3 MBytes 212 Mbits/sec 0 346 KBytes
[ 5] 9.00-10.00 sec 25.4 MBytes 213 Mbits/sec 0 346 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 252 MBytes 211 Mbits/sec 0 sender
[ 5] 0.00-10.02 sec 250 MBytes 210 Mbits/sec receiver
stmmac receiving shows the same problem:
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 64.1 MBytes 537 Mbits/sec
[ 5] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 3.00-4.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec
^C[ 5] 9.00-9.43 sec 0.00 Bytes 0.00 bits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-9.43 sec 0.00 Bytes 0.00 bits/sec sender
[ 5] 0.00-9.43 sec 64.1 MBytes 57.0 Mbits/sec receiver
iperf3: interrupt - the client has terminated
and it's now spewing PAUSE frames again.
The RXQ 0 debug register shows:
Value at address 0x02490d38: 0x002b0020
bits 29:16 (PRXQ = 43) give the number of packets in the RX queue;
bits 5:4 (RXQSTS = 0b10) show that the internal RX queue fill level
is above the flow control activate threshold.
The RXQ 0 operating mode register shows:
Value at address 0x02490d30: 0x0ff1c4e0
bits 29:20 (RQS = 255) indicate that the receive queue size is
(255 + 1) * 256 = 65536 bytes (which is what hw feature 1 reports);
bits 16:14 (RFD = 7) give the threshold for deactivating flow
control;
bits 10:8 (RFA = 4) give the threshold for activating flow control.
Disabling EHFC (bit 7, enable hardware flow control) stops the flood.
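
For anyone wanting to repeat the decode, it's just shifts and masks;
a throwaway userspace sketch using the two values quoted above (field
positions as in the dwmac4 databook):

#include <stdio.h>

#define FIELD(val, hi, lo) \
        (((val) >> (lo)) & ((1U << ((hi) - (lo) + 1)) - 1))

int main(void)
{
        unsigned int dbg  = 0x002b0020; /* MTL RXQ0 debug */
        unsigned int mode = 0x0ff1c4e0; /* MTL RXQ0 operation mode */

        printf("PRXQ   = %u packets queued\n", FIELD(dbg, 29, 16));
        printf("RXQSTS = %u (2 = above activate threshold)\n",
               FIELD(dbg, 5, 4));
        printf("RQS = %u -> %u bytes\n", FIELD(mode, 29, 20),
               (FIELD(mode, 29, 20) + 1) * 256);
        printf("RFD = %u RFA = %u EHFC = %u\n",
               FIELD(mode, 16, 14), FIELD(mode, 10, 8),
               FIELD(mode, 7, 7));
        return 0;
}
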
Looking at the receive descriptor ring, all the entries are marked
with RDES3_OWN | RDES3_BUFFER1_VALID_ADDR - so there are free ring
entries, but the hardware is not transferring the queued packets.
Looking at the channel 0 status register, it's indicating RBU
(receive buffer unavailable).
This gets weirder.
Channel 0 Rx descriptor tail pointer register:
Value at address 0x02491128: 0xffffee30
Channel 0 current application receive descriptor register:
Value at address 0x0249114c: 0xffffee30
Receive queue descriptor:
227 [0x0000007fffffee30]: 0xfee00040 0x7f 0x0 0x81000000
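
Decoding that RDES3 word for clarity (flag values as in the driver's
dwmac4_descs.h):

#include <stdio.h>
#include <stdint.h>

#define BIT(n)                          (1U << (n))
#define RDES3_OWN                       BIT(31)
#define RDES3_BUFFER1_VALID_ADDR        BIT(24)

int main(void)
{
        uint32_t rdes3 = 0x81000000;    /* from the descriptor above */

        /* Both flags set: the descriptor belongs to the DMA engine
         * and has a valid buffer address, yet the channel reports
         * RBU, so the stall is not down to a lack of usable
         * descriptors.
         */
        printf("OWN=%u BUF1V=%u\n",
               !!(rdes3 & RDES3_OWN),
               !!(rdes3 & RDES3_BUFFER1_VALID_ADDR));
        return 0;
}
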
I've tried writing to the tail pointer register (both the current
value and the next descriptor's value), but this doesn't seem to
change anything.
I've also tried clearing SR in DMA_CHAN_RX_CONTROL() and then
setting it again; still no change.
So, it looks like the receive hardware has permanently stalled,
needing at minimum a soft reset of the entire stmmac core to
recover it.
I think I'm going to have to declare stmmac receive on dwmac4 to
be buggy at the moment, as I can't get to the bottom of what's
causing this.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!