Message-ID: <AM8PR04MB73153260540558FF6793E63AFFE70@AM8PR04MB7315.eurprd04.prod.outlook.com>
Date: Thu, 12 Nov 2020 01:29:38 +0000
From: Andy Duan <fugang.duan@....com>
To: Kegl Rohit <keglrohit@...il.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [EXT] Fwd: net: fec: rx descriptor ring out of order
From: Kegl Rohit <keglrohit@...il.com> Sent: Wednesday, November 11, 2020 10:27 PM
> Hello!
>
> We are using an i.MX6Q platform.
> The fec interface is used to receive a continuous stream of custom / raw
> ethernet packets. The packet size is fixed ~132 bytes and they get sent every
> 250µs.
>
> While testing I observed spontaneous packet delays from time to time.
> After digging down deeper I think that the fec peripheral does not update the rx
> descriptor status correctly.
> I modified the queue_rx function, which is called by the NAPI poll function. "no
> packet N" is printed when the queue_rx function doesn't process any descriptor.
> The variable N therefore counts consecutive calls without a ready descriptor.
> When the current descriptor is ready, gets processed, and the ring moves on to
> the next entry, N is reset again.
> Additionally an error is printed if the current descriptor is empty but the next
> one is already ready. In case this error happens the current descriptor and the
> next 11 ones are dumped.
> "C" ... current
> "E" ... empty
>
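The check described above ("current empty but next is not empty") can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the modified driver code from the original message: the function name rx_ready_ahead and the flat array of status words are hypothetical, and RX_BD_EMPTY = 0x8000 is inferred from the dumps below, where descriptors marked "E" read 0x8840 and filled ones read 0x0840.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed "empty" bit: bit 15, matching the "E" column in the dumps
 * (0x8840 = empty, 0x0840 = filled). */
#define RX_BD_EMPTY 0x8000u

/*
 * Return how many descriptors directly after `cur` are already filled
 * while `cur` itself is still marked empty. A result of 0 means either
 * that `cur` is ready (normal case) or that nothing behind it is ready.
 * The ring is modelled as a plain array of status words (hypothetical).
 */
size_t rx_ready_ahead(const uint16_t *status, size_t ring_size, size_t cur)
{
    size_t ahead = 0;

    assert(status != NULL && ring_size > 0 && cur < ring_size);

    if (!(status[cur] & RX_BD_EMPTY))
        return 0;                       /* current descriptor is ready */

    for (size_t i = 1; i < ring_size; i++) {
        size_t idx = (cur + i) % ring_size;   /* wrap around the ring */

        if (status[idx] & RX_BD_EMPTY)
            break;                      /* first still-empty descriptor */
        ahead++;
    }
    return ahead;
}
```

A non-zero return value corresponds to the "ring error" line in the log; the value itself is the number of packets waiting behind the stuck descriptor.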
> [ 57.436478 < 0.020005>] no packet 1!
> [ 57.460850 < 0.024372>] no packet 1!
> [ 57.461107 < 0.000257>] ring error, current empty but next is not empty
> [ 57.461118 < 0.000011>] RX ahead
> [ 57.461135 < 0.000017>] 129 C E 0x8840 0x2c743a40 132
> [ 57.461146 < 0.000011>] 130 0x0840 0x2c744180 132
> [ 57.461158 < 0.000012>] 131 E 0x8840 0x2c7448c0 132
> [ 57.461170 < 0.000012>] 132 E 0x8840 0x2c745000 132
> [ 57.461181 < 0.000011>] 133 E 0x8840 0x2c745740 132
> [ 57.461192 < 0.000011>] 134 E 0x8840 0x2c745e80 132
> [ 57.461204 < 0.000012>] 135 E 0x8880 0x2c7465c0 114
> [ 57.461215 < 0.000011>] 136 E 0x8840 0x2c746d00 132
> [ 57.461227 < 0.000012>] 137 E 0x8840 0x2c747440 132
> [ 57.461239 < 0.000012>] 138 E 0x8840 0x2c748040 132
> [ 57.461250 < 0.000011>] 139 E 0x8840 0x2c748780 132
> [ 57.461262 < 0.000012>] 140 E 0x8840 0x2c748ec0 132
> [ 57.461477 < 0.000008>] no packet 2!
> [ 57.461506 < 0.000029>] ring error, current empty but next is not empty
> [ 57.461537 < 0.000031>] RX ahead
> [ 57.461550 < 0.000013>] 129 C E 0x8840 0x2c743a40 132
> [ 57.461563 < 0.000013>] 130 0x0840 0x2c744180 132
> [ 57.461577 < 0.000014>] 131 0x0840 0x2c7448c0 132
> [ 57.461589 < 0.000012>] 132 0x0840 0x2c745000 132
> [ 57.461601 < 0.000012>] 133 E 0x8840 0x2c745740 132
> [ 57.461613 < 0.000012>] 134 E 0x8840 0x2c745e80 132
> [ 57.461624 < 0.000011>] 135 E 0x8880 0x2c7465c0 114
> [ 57.461635 < 0.000011>] 136 E 0x8840 0x2c746d00 132
> [ 57.461645 < 0.000010>] 137 E 0x8840 0x2c747440 132
> [ 57.461657 < 0.000012>] 138 E 0x8840 0x2c748040 132
> [ 57.461668 < 0.000011>] 139 E 0x8840 0x2c748780 132
> [ 57.461680 < 0.000012>] 140 E 0x8840 0x2c748ec0 132
> [ 57.461894 < 0.000009>] no packet 3!
> [ 57.461926 < 0.000032>] ring error, current empty but next is not empty
> [ 57.461935 < 0.000009>] RX ahead
> [ 57.461947 < 0.000012>] 129 C E 0x8840 0x2c743a40 132
> [ 57.461959 < 0.000012>] 130 0x0840 0x2c744180 132
> [ 57.461970 < 0.000011>] 131 0x0840 0x2c7448c0 132
> [ 57.461982 < 0.000012>] 132 0x0840 0x2c745000 132
> [ 57.461993 < 0.000011>] 133 0x0840 0x2c745740 132
> [ 57.462005 < 0.000012>] 134 E 0x8840 0x2c745e80 132
> [ 57.462017 < 0.000012>] 135 E 0x8880 0x2c7465c0 114
> [ 57.462028 < 0.000011>] 136 E 0x8840 0x2c746d00 132
> [ 57.462039 < 0.000011>] 137 E 0x8840 0x2c747440 132
> [ 57.462051 < 0.000012>] 138 E 0x8840 0x2c748040 132
> [ 57.462062 < 0.000011>] 139 E 0x8840 0x2c748780 132
> [ 57.462075 < 0.000013>] 140 E 0x8840 0x2c748ec0 132
> [ 57.462289 < 0.000009>] no packet 4!
> [ 57.462316 < 0.000027>] ring error, current empty but next is not empty
> [ 57.462326 < 0.000010>] RX ahead
> [ 57.462339 < 0.000013>] 129 C E 0x8840 0x2c743a40 132
> [ 57.462351 < 0.000012>] 130 0x0840 0x2c744180 132
> [ 57.462362 < 0.000011>] 131 0x0840 0x2c7448c0 132
> [ 57.462373 < 0.000011>] 132 0x0840 0x2c745000 132
> [ 57.462384 < 0.000011>] 133 0x0840 0x2c745740 132
> [ 57.462397 < 0.000013>] 134 0x0840 0x2c745e80 132
> [ 57.462408 < 0.000011>] 135 0x0840 0x2c7465c0 132
> [ 57.462421 < 0.000013>] 136 E 0x8840 0x2c746d00 132
> [ 57.462431 < 0.000010>] 137 E 0x8840 0x2c747440 132
> [ 57.462443 < 0.000012>] 138 E 0x8840 0x2c748040 132
> [ 57.462454 < 0.000011>] 139 E 0x8840 0x2c748780 132
> [ 57.462467 < 0.000013>] 140 E 0x8840 0x2c748ec0 132
> [ 57.462697 < 0.000009>] no packet 5!
> [ 57.462730 < 0.000033>] ring error, current empty but next is not empty
> [ 57.462739 < 0.000009>] RX ahead
> [ 57.462752 < 0.000013>] 129 C E 0x8840 0x2c743a40 132
> [ 57.462763 < 0.000011>] 130 0x0840 0x2c744180 132
> [ 57.462775 < 0.000012>] 131 0x0840 0x2c7448c0 132
> [ 57.462787 < 0.000012>] 132 0x0840 0x2c745000 132
> [ 57.462799 < 0.000012>] 133 0x0840 0x2c745740 132
> [ 57.462809 < 0.000010>] 134 0x0840 0x2c745e80 132
> [ 57.462820 < 0.000011>] 135 0x0840 0x2c7465c0 132
> [ 57.462830 < 0.000010>] 136 0x0840 0x2c746d00 132
> [ 57.462842 < 0.000012>] 137 0x0840 0x2c747440 132
> [ 57.462853 < 0.000011>] 138 E 0x8840 0x2c748040 132
> [ 57.462864 < 0.000011>] 139 E 0x8840 0x2c748780 132
> [ 57.462877 < 0.000013>] 140 E 0x8840 0x2c748ec0 132
> [ 57.463093 < 0.000009>] no packet 6!
> [ 57.463120 < 0.000027>] RX ahead
> [ 57.463133 < 0.000013>] 129 C 0x0840 0x2c743a40 132
> [ 57.463144 < 0.000011>] 130 0x0840 0x2c744180 132
> [ 57.463155 < 0.000011>] 131 0x0840 0x2c7448c0 132
> [ 57.463166 < 0.000011>] 132 0x0840 0x2c745000 132
> [ 57.463179 < 0.000013>] 133 0x0840 0x2c745740 132
> [ 57.463190 < 0.000011>] 134 0x0840 0x2c745e80 132
> [ 57.463201 < 0.000011>] 135 0x0840 0x2c7465c0 132
> [ 57.463213 < 0.000012>] 136 0x0840 0x2c746d00 132
> [ 57.463224 < 0.000011>] 137 0x0840 0x2c747440 132
> [ 57.463235 < 0.000011>] 138 0x0840 0x2c748040 132
> [ 57.463245 < 0.000010>] 139 E 0x8840 0x2c748780 132
> [ 57.463256 < 0.000011>] 140 E 0x8840 0x2c748ec0 132
> [ 57.463695 < 0.000244>] rx 12
>
> As you can see, the described error is caught and the ring is dumped.
> 9 descriptors became ready before the current descriptor did.
> After that the current descriptor became ready and 12 packets were processed at
> once.
> I could also observe cases where the ring (512 entries) filled up before the
> current descriptor was marked ready.
> There were also cases where neither the current nor the next descriptor was ready:
> [ 57.462752 < 0.000013>] 129 C E 0x8840 0x2c743a40 132
> [ 57.462763 < 0.000011>] 130 E 0x0840 0x2c744180 132
> [ 57.462775 < 0.000012>] 131 0x0840 0x2c7448c0 132
>
> I suspect this erratum:
>
> ERR005783 ENET: ENET Status FIFO may overflow due to consecutive short
> frames
> Description:
> When the MAC receives shorter frames (size 64 bytes) at a rate exceeding the
> average line-rate burst traffic of 400 Mbps the DMA is able to absorb, the
> receiver might drop incoming frames before a Pause frame is issued.
> Projected Impact:
> No malfunction will result aside from the frame drops.
> Workarounds:
> The application might want to implement some flow control to ensure the
> line-rate burst traffic is below 400 Mbps if it only uses consecutive small
> frames with minimal (96 bit times) or short Inter-frame gap (IFG) time
> following large frames at such a high rate.
> The limit does not exist for frames of size larger than 800 bytes.
> Proposed Solution:
> No fix scheduled
> Linux BSP Status:
> Workaround possible but not implemented in the BSP, impacting functionality as
> described above.
>
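For scale, the stream described at the top of the quoted message (about 132 bytes every 250 µs) can be compared with the 400 Mbps figure from the erratum. The helper below is a minimal sketch with a hypothetical name; it computes only the average frame bit rate and ignores preamble and inter-frame gap. The erratum speaks of burst rate with short frames, so a low average does not by itself rule the erratum out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Average bit rate (bits per second) of fixed-size frames sent at a
 * fixed period. Preamble and inter-frame gap are deliberately ignored.
 */
uint64_t avg_bits_per_second(uint32_t frame_bytes, uint32_t period_us)
{
    assert(period_us > 0);
    return (uint64_t)frame_bytes * 8u * 1000000u / period_us;
}
```

With the numbers from the message, avg_bits_per_second(132, 250) gives 4224000, i.e. roughly 4.2 Mbit/s on average.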
> Is the "ENET Status FIFO" some internal hardware FIFO, or is it the descriptor
> ring?
> What would the workaround be, given that a "Workaround is possible"?
>
> I could only think of skipping/dropping the descriptor when the current is still
> busy but the next one is ready.
> But it is not easily possible because the "stuck" descriptor gets ready after a
> huge delay.
>
> Is this issue known already? Any suggestions?
>
We don't see this issue on our side.
Yes, the IP has this erratum on i.MX6Q; the workaround is to enable HW flow control.
Keep HW flow control enabled on your network link to avoid FIFO overruns.
Regards,
Andy
>
> Thanks in advance