Date: Wed, 11 Nov 2020 23:10:58 +0100
From: Kegl Rohit <keglrohit@...il.com>
To: David Laight <David.Laight@...lab.com>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Andy Duan <fugang.duan@....com>
Subject: Re: Fwd: net: fec: rx descriptor ring out of order
On Wed, Nov 11, 2020 at 6:52 PM David Laight <David.Laight@...lab.com> wrote:
>
> > On 11/11/20 3:27 PM, Kegl Rohit wrote:
> > > Hello!
> > >
> > > We are using an i.MX6Q platform.
> > > The fec interface is used to receive a continuous stream of custom /
> > > raw ethernet packets. The packet size is fixed at ~132 bytes and a
> > > packet is sent every 250µs.
> > >
> > > While testing I observed spontaneous packet delays from time to time.
> > > After digging deeper, I think the fec peripheral does not update the
> > > rx descriptor status correctly.
> > > I modified the queue_rx function, which is called by the NAPI poll
> > > function: "no packet N" is printed when queue_rx does not process any
> > > descriptor, so N counts consecutive calls without ready descriptors.
> > > N is cleared again once the current descriptor is ready, processed,
> > > and the ring moves on to the next entry.
> > > Additionally, an error is printed if the current descriptor is empty
> > > but the next one is already ready. In that case the current
> > > descriptor and the next 11 are dumped.
> > > "C" ... current
> > > "E" ... empty
> > >
> > > [ 57.436478 < 0.020005>] no packet 1!
> > > [ 57.460850 < 0.024372>] no packet 1!
> > > [ 57.461107 < 0.000257>] ring error, current empty but next is not empty
> > > [ 57.461118 < 0.000011>] RX ahead
> > > [ 57.461135 < 0.000017>] 129 C E 0x8840 0x2c743a40 132
> > > [ 57.461146 < 0.000011>] 130 0x0840 0x2c744180 132
> > > [ 57.461158 < 0.000012>] 131 E 0x8840 0x2c7448c0 132
>
> What are the addresses of the ring entries?
> I bet there is something wrong with the cache coherency and/or
> flushing.
The ring descriptors are allocated via dma_alloc_coherent(). I will
extend the dump with their addresses.
The current output shows the dma_map_single() skb data buffer.
I tried calling flush_cache_all() before reading the descriptor
status => no change.
Are there any flush options to try?
> So the MAC hardware has done the write but (somewhere) it
> isn't visible to the cpu for ages.
It looks like that. After an error occurs I will also read the skb
data (dma_sync_single() first) to check whether the new data is
already there.
If the data is already there, that would prove the status word is
stale.
> I've seen a 'fec' ethernet block in a freescale DSP.
> IIRC it is a fairly simple block - won't be doing out-of-order writes.
>
> The imx6q seems to be arm based.
> I'm guessing that means it doesn't do cache coherency for ethernet dma
> accesses.
> That (more or less) means the rings need to be mapped uncached.
> Any attempt to just flush/invalidate the cache lines is doomed.
The descriptors are allocated using dma_alloc_coherent(), so flushes
should not be needed? Synchronization is done via barriers, e.g. wmb()
before resetting the descriptor status.
The skb data itself is mapped using the DMA API.
> ...
> > > I am suspecting the errata:
> > >
> > > ERR005783 ENET: ENET Status FIFO may overflow due to consecutive short frames
> > > Description:
> > > When the MAC receives shorter frames (size 64 bytes) at a rate exceeding
> > > the average line-rate burst traffic of 400 Mbps the DMA is able to absorb,
> > > the receiver might drop incoming frames before a Pause frame is issued.
> > > Projected Impact:
> > > No malfunction will result aside from the frame drops.
> > > Workarounds:
> > > The application might want to implement some flow control to ensure the
> > > line-rate burst traffic is below 400 Mbps if it only uses consecutive
> > > small frames with minimal (96 bit times) or short Inter-frame gap (IFG)
> > > time following large frames at such a high rate. The limit does not
> > > exist for frames of size larger than 800 bytes.
> > > Proposed Solution:
> > > No fix scheduled
> > > Linux BSP Status:
> > > Workaround possible but not implemented in the BSP, impacting
> > > functionality as described above.
> > >
> > > Is the "ENET Status FIFO" some internal hardware FIFO, or is it the
> > > descriptor ring?
> > > What would the workaround be, given that a "workaround is possible"?
>
> I don't think that is applicable.
> It looks like it just drops frames under high load.
Hm ok.
> I've no idea what a 'Linux BSP' might be.
> That term is usually used for the (often broken) board support
> for things like Vx(no-longer)Works.
Hm ok.
> > > I could only think of skipping/dropping the descriptor when the
> > > current one is still busy but the next one is ready.
> > > But that is not easily possible, because the "stuck" descriptor
> > > becomes ready after a huge delay.
>
> I bet the descriptor is at the end of a cache line which finally
> gets re-read.
Would flush_cache_all() have solved this problem?