netdev - RE: Fwd: net: fec: rx descriptor ring out of order

Date:   Wed, 11 Nov 2020 17:51:58 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Eric Dumazet' <eric.dumazet@...il.com>,
        Kegl Rohit <keglrohit@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Andy Duan <fugang.duan@....com>
Subject: RE: Fwd: net: fec: rx descriptor ring out of order

> On 11/11/20 3:27 PM, Kegl Rohit wrote:
> > Hello!
> >
> > We are using an imx6q platform.
> > The fec interface is used to receive a continuous stream of custom /
> > raw ethernet packets. The packet size is fixed ~132 bytes and they get
> > sent every 250µs.
> >
> > While testing I observed spontaneous packet delays from time to time.
> > After digging down deeper I think that the fec peripheral does not
> > update the rx descriptor status correctly.
> > I modified the queue_rx function which is called by the NAPI poll
> > function. "no packet N" is printed when the queue_rx function doesn't
> > process any descriptor.
> > Therefore the variable N counts consecutive calls without ready
> > descriptors. When the current descriptor is ready and processed and
> > the driver moves to the next entry, N is cleared again.
> > Additionally an error is printed if the current descriptor is empty
> > but the next one is already ready. In case this error happens the
> > current descriptor and the next 11 ones are dumped.
> > "C"  ... current
> > "E"  ... empty
> >
> > [   57.436478 <    0.020005>] no packet 1!
> > [   57.460850 <    0.024372>] no packet 1!
> > [   57.461107 <    0.000257>] ring error, current empty but next is not empty
> > [   57.461118 <    0.000011>] RX ahead
> > [   57.461135 <    0.000017>] 129 C E 0x8840 0x2c743a40  132
> > [   57.461146 <    0.000011>] 130     0x0840 0x2c744180  132
> > [   57.461158 <    0.000012>] 131   E 0x8840 0x2c7448c0  132
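For reference, the check described above can be sketched roughly as
follows. This is a standalone illustration, not the code from the
driver or from the original poster: the struct layout and the function
name are hypothetical. The only thing taken from the dump is that the
"E" entries have bit 15 (0x8000) set in the status word and the filled
entry does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 15 of the status word marks a descriptor the hardware still owns
 * ("E" in the dump: 0x8840 has it set, 0x0840 does not). */
#define RX_EMPTY 0x8000u

/* Hypothetical, simplified descriptor; the real layout may differ. */
struct rx_desc {
    uint16_t status;
    uint16_t len;
    uint32_t buf_addr;
};

/* True when the current descriptor is still empty but the one after it
 * (with wrap-around at the end of the ring) has already been filled. */
static bool ring_out_of_order(const struct rx_desc *ring, unsigned size,
                              unsigned cur)
{
    unsigned next = (cur + 1) % size;

    return (ring[cur].status & RX_EMPTY) &&
           !(ring[next].status & RX_EMPTY);
}
```

Fed the three entries from the dump (129, 130, 131), this reports the
condition for entry 129 only.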

What are the addresses of the ring entries?
I bet there is something wrong with the cache coherency and/or
flushing.

So the MAC hardware has done the write, but somewhere along the way
it isn't visible to the CPU for ages.

I've seen a 'fec' Ethernet block in a Freescale DSP.
IIRC it is a fairly simple block, so it won't be doing out-of-order writes.

The imx6q seems to be ARM-based.
I'm guessing that means it doesn't do cache coherency for Ethernet DMA
accesses.
That (more or less) means the rings need to be mapped uncached.
Any attempt to just flush/invalidate the cache lines is doomed.

...
> > I am suspecting the errata:
> >
> > ERR005783 ENET: ENET Status FIFO may overflow due to consecutive short frames
> > Description:
> > When the MAC receives shorter frames (size 64 bytes) at a rate
> > exceeding the average line-rate
> > burst traffic of 400 Mbps the DMA is able to absorb, the receiver
> > might drop incoming frames
> > before a Pause frame is issued.
> > Projected Impact:
> > No malfunction will result aside from the frame drops.
> > Workarounds:
> > The application might want to implement some flow control to ensure
> > the line-rate burst traffic is
> > below 400 Mbps if it only uses consecutive small frames with minimal
> > (96 bit times) or short
> > Inter-frame gap (IFG) time following large frames at such a high rate.
> > The limit does not exist for
> > frames of size larger than 800 bytes.
> > Proposed Solution:
> > No fix scheduled
> > Linux BSP Status:
> > Workaround possible but not implemented in the BSP, impacting
> > functionality as described above.
> >
> > Is the "ENET Status FIFO" some internal hardware FIFO or is it the
> > descriptor ring?
> > What would be the workaround when a "Workaround is possible"?

I don't think that is applicable.
It looks like it just drops frames under high load.
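A quick back-of-the-envelope check, using only the figures quoted
above (132-byte frames, one every 250µs). The helper name is made up
for illustration, and the 20 bytes of per-frame overhead in the usage
note (preamble, SFD and minimum inter-frame gap) are the standard
Ethernet values, not something taken from the report.

```c
/* Average bit rate in Mbit/s for fixed-size frames sent at a fixed
 * period. Bits per microsecond is numerically the same as Mbit/s. */
static double avg_rate_mbps(unsigned frame_bytes, double period_us)
{
    return frame_bytes * 8.0 / period_us;
}
```

avg_rate_mbps(132, 250.0) gives about 4.2 Mbit/s, and about 4.9 Mbit/s
with the 20 bytes of overhead added. The erratum talks about bursts
rather than averages, so this is only indicative, but evenly spaced
frames at that rate are a long way from 400 Mbps.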

I've no idea what a 'Linux BSP' might be.
That term is usually used for the (often broken) board support
for things like Vx(no-longer)Works.

> > I could only think of skipping/dropping the descriptor when the
> > current is still busy but the next one is ready.
> > But it is not easily possible because the "stuck" descriptor gets
> > ready after a huge delay.

I bet the descriptor is at the end of a cache line which finally
gets re-read.
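A toy model of what I mean, purely as an illustration. It is not
driver code, all the names are invented, and the four descriptors per
cache line is an assumption (8-byte descriptors in a 32-byte line).
The MAC updates memory, the CPU keeps reading its cached copy of the
line, and the update only shows up once the line is dropped and
re-read.

```c
#include <stdint.h>
#include <string.h>

#define RX_EMPTY      0x8000u
#define DESC_PER_LINE 4   /* assumption: 8-byte descriptors, 32-byte lines */

/* One cache line's worth of descriptor status words. */
struct line_model {
    uint16_t mem[DESC_PER_LINE];     /* what the MAC wrote to memory */
    uint16_t cached[DESC_PER_LINE];  /* what the CPU currently sees */
    int valid;                       /* cached copy present? */
};

/* MAC completes descriptor i: clears EMPTY in memory only. */
static void mac_complete(struct line_model *l, unsigned i)
{
    l->mem[i] &= (uint16_t)~RX_EMPTY;
}

/* CPU read: served from the cache while the line is valid, otherwise
 * the whole line is refilled from memory first. */
static uint16_t cpu_read(struct line_model *l, unsigned i)
{
    if (!l->valid) {
        memcpy(l->cached, l->mem, sizeof(l->cached));
        l->valid = 1;
    }
    return l->cached[i];
}

/* Eviction or an explicit invalidate drops the cached copy. */
static void invalidate(struct line_model *l)
{
    l->valid = 0;
}
```

After mac_complete() the CPU still reads the descriptor as empty until
something calls invalidate(), which is the "ready after a huge delay"
symptom.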

	David
