Message-ID: <AM8PR04MB73153260540558FF6793E63AFFE70@AM8PR04MB7315.eurprd04.prod.outlook.com>
Date:   Thu, 12 Nov 2020 01:29:38 +0000
From:   Andy Duan <fugang.duan@....com>
To:     Kegl Rohit <keglrohit@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [EXT] Fwd: net: fec: rx descriptor ring out of order

From: Kegl Rohit <keglrohit@...il.com> Sent: Wednesday, November 11, 2020 10:27 PM
> Hello!
> 
> We are using an i.MX6Q platform.
> The fec interface is used to receive a continuous stream of custom / raw
> ethernet packets. The packet size is fixed ~132 bytes and they get sent every
> 250µs.
> 
> While testing I observed spontaneous packet delays from time to time.
> After digging down deeper I think that the fec peripheral does not update the rx
> descriptor status correctly.
> I modified the queue_rx function, which is called by the NAPI poll function, to
> print "no packet N" whenever it does not process any descriptor.
> The variable N therefore counts consecutive calls without a ready descriptor.
> When the current descriptor is ready and processed and the driver has moved to
> the next entry, N is cleared again.
> Additionally, an error is printed if the current descriptor is empty but the next
> one is already ready. When this error occurs, the current descriptor and the
> next 11 are dumped.
> "C"  ... current
> "E"  ... empty
> 
> [   57.436478 <    0.020005>] no packet 1!
> [   57.460850 <    0.024372>] no packet 1!
> [   57.461107 <    0.000257>] ring error, current empty but next is not empty
> [   57.461118 <    0.000011>] RX ahead
> [   57.461135 <    0.000017>] 129 C E 0x8840 0x2c743a40  132
> [   57.461146 <    0.000011>] 130     0x0840 0x2c744180  132
> [   57.461158 <    0.000012>] 131   E 0x8840 0x2c7448c0  132
> [   57.461170 <    0.000012>] 132   E 0x8840 0x2c745000  132
> [   57.461181 <    0.000011>] 133   E 0x8840 0x2c745740  132
> [   57.461192 <    0.000011>] 134   E 0x8840 0x2c745e80  132
> [   57.461204 <    0.000012>] 135   E 0x8880 0x2c7465c0  114
> [   57.461215 <    0.000011>] 136   E 0x8840 0x2c746d00  132
> [   57.461227 <    0.000012>] 137   E 0x8840 0x2c747440  132
> [   57.461239 <    0.000012>] 138   E 0x8840 0x2c748040  132
> [   57.461250 <    0.000011>] 139   E 0x8840 0x2c748780  132
> [   57.461262 <    0.000012>] 140   E 0x8840 0x2c748ec0  132
> [   57.461477 <    0.000008>] no packet 2!
> [   57.461506 <    0.000029>] ring error, current empty but next is not empty
> [   57.461537 <    0.000031>] RX ahead
> [   57.461550 <    0.000013>] 129 C E 0x8840 0x2c743a40  132
> [   57.461563 <    0.000013>] 130     0x0840 0x2c744180  132
> [   57.461577 <    0.000014>] 131     0x0840 0x2c7448c0  132
> [   57.461589 <    0.000012>] 132     0x0840 0x2c745000  132
> [   57.461601 <    0.000012>] 133   E 0x8840 0x2c745740  132
> [   57.461613 <    0.000012>] 134   E 0x8840 0x2c745e80  132
> [   57.461624 <    0.000011>] 135   E 0x8880 0x2c7465c0  114
> [   57.461635 <    0.000011>] 136   E 0x8840 0x2c746d00  132
> [   57.461645 <    0.000010>] 137   E 0x8840 0x2c747440  132
> [   57.461657 <    0.000012>] 138   E 0x8840 0x2c748040  132
> [   57.461668 <    0.000011>] 139   E 0x8840 0x2c748780  132
> [   57.461680 <    0.000012>] 140   E 0x8840 0x2c748ec0  132
> [   57.461894 <    0.000009>] no packet 3!
> [   57.461926 <    0.000032>] ring error, current empty but next is not empty
> [   57.461935 <    0.000009>] RX ahead
> [   57.461947 <    0.000012>] 129 C E 0x8840 0x2c743a40  132
> [   57.461959 <    0.000012>] 130     0x0840 0x2c744180  132
> [   57.461970 <    0.000011>] 131     0x0840 0x2c7448c0  132
> [   57.461982 <    0.000012>] 132     0x0840 0x2c745000  132
> [   57.461993 <    0.000011>] 133     0x0840 0x2c745740  132
> [   57.462005 <    0.000012>] 134   E 0x8840 0x2c745e80  132
> [   57.462017 <    0.000012>] 135   E 0x8880 0x2c7465c0  114
> [   57.462028 <    0.000011>] 136   E 0x8840 0x2c746d00  132
> [   57.462039 <    0.000011>] 137   E 0x8840 0x2c747440  132
> [   57.462051 <    0.000012>] 138   E 0x8840 0x2c748040  132
> [   57.462062 <    0.000011>] 139   E 0x8840 0x2c748780  132
> [   57.462075 <    0.000013>] 140   E 0x8840 0x2c748ec0  132
> [   57.462289 <    0.000009>] no packet 4!
> [   57.462316 <    0.000027>] ring error, current empty but next is not empty
> [   57.462326 <    0.000010>] RX ahead
> [   57.462339 <    0.000013>] 129 C E 0x8840 0x2c743a40  132
> [   57.462351 <    0.000012>] 130     0x0840 0x2c744180  132
> [   57.462362 <    0.000011>] 131     0x0840 0x2c7448c0  132
> [   57.462373 <    0.000011>] 132     0x0840 0x2c745000  132
> [   57.462384 <    0.000011>] 133     0x0840 0x2c745740  132
> [   57.462397 <    0.000013>] 134     0x0840 0x2c745e80  132
> [   57.462408 <    0.000011>] 135     0x0840 0x2c7465c0  132
> [   57.462421 <    0.000013>] 136   E 0x8840 0x2c746d00  132
> [   57.462431 <    0.000010>] 137   E 0x8840 0x2c747440  132
> [   57.462443 <    0.000012>] 138   E 0x8840 0x2c748040  132
> [   57.462454 <    0.000011>] 139   E 0x8840 0x2c748780  132
> [   57.462467 <    0.000013>] 140   E 0x8840 0x2c748ec0  132
> [   57.462697 <    0.000009>] no packet 5!
> [   57.462730 <    0.000033>] ring error, current empty but next is not empty
> [   57.462739 <    0.000009>] RX ahead
> [   57.462752 <    0.000013>] 129 C E 0x8840 0x2c743a40  132
> [   57.462763 <    0.000011>] 130     0x0840 0x2c744180  132
> [   57.462775 <    0.000012>] 131     0x0840 0x2c7448c0  132
> [   57.462787 <    0.000012>] 132     0x0840 0x2c745000  132
> [   57.462799 <    0.000012>] 133     0x0840 0x2c745740  132
> [   57.462809 <    0.000010>] 134     0x0840 0x2c745e80  132
> [   57.462820 <    0.000011>] 135     0x0840 0x2c7465c0  132
> [   57.462830 <    0.000010>] 136     0x0840 0x2c746d00  132
> [   57.462842 <    0.000012>] 137     0x0840 0x2c747440  132
> [   57.462853 <    0.000011>] 138   E 0x8840 0x2c748040  132
> [   57.462864 <    0.000011>] 139   E 0x8840 0x2c748780  132
> [   57.462877 <    0.000013>] 140   E 0x8840 0x2c748ec0  132
> [   57.463093 <    0.000009>] no packet 6!
> [   57.463120 <    0.000027>] RX ahead
> [   57.463133 <    0.000013>] 129 C   0x0840 0x2c743a40  132
> [   57.463144 <    0.000011>] 130     0x0840 0x2c744180  132
> [   57.463155 <    0.000011>] 131     0x0840 0x2c7448c0  132
> [   57.463166 <    0.000011>] 132     0x0840 0x2c745000  132
> [   57.463179 <    0.000013>] 133     0x0840 0x2c745740  132
> [   57.463190 <    0.000011>] 134     0x0840 0x2c745e80  132
> [   57.463201 <    0.000011>] 135     0x0840 0x2c7465c0  132
> [   57.463213 <    0.000012>] 136     0x0840 0x2c746d00  132
> [   57.463224 <    0.000011>] 137     0x0840 0x2c747440  132
> [   57.463235 <    0.000011>] 138     0x0840 0x2c748040  132
> [   57.463245 <    0.000010>] 139   E 0x8840 0x2c748780  132
> [   57.463256 <    0.000011>] 140   E 0x8840 0x2c748ec0  132
> [   57.463695 <    0.000244>] rx 12
> 
> As you can see, the described error is caught and the ring is dumped.
> 9 descriptors became ready before the current descriptor did.
> After that, the current descriptor became ready and 12 packets were processed at
> once.
> I could also observe cases where the ring (512 entries) filled up before the
> current descriptor was cleared,
> and cases where both the current and the next descriptor were not ready:
> [   57.462752 <    0.000013>] 129 C E 0x8840 0x2c743a40  132
> [   57.462763 <    0.000011>] 130    E 0x0840 0x2c744180  132
> [   57.462775 <    0.000012>] 131     0x0840 0x2c7448c0  132
> 
> I suspect this erratum:
> 
> ERR005783 ENET: ENET Status FIFO may overflow due to consecutive short frames
> Description:
> When the MAC receives shorter frames (size 64 bytes) at a rate exceeding the
> average line-rate burst traffic of 400 Mbps the DMA is able to absorb, the
> receiver might drop incoming frames before a Pause frame is issued.
> Projected Impact:
> No malfunction will result aside from the frame drops.
> Workarounds:
> The application might want to implement some flow control to ensure the
> line-rate burst traffic is below 400 Mbps if it only uses consecutive small
> frames with minimal (96 bit times) or short Inter-frame gap (IFG) time
> following large frames at such a high rate.
> The limit does not exist for frames of size larger than 800 bytes.
> Proposed Solution:
> No fix scheduled
> Linux BSP Status:
> Workaround possible but not implemented in the BSP, impacting functionality as
> described above.
> 
> Is the "ENET Status FIFO" an internal hardware FIFO, or is it the descriptor
> ring?
> What would the workaround be, given that a workaround is "possible"?
> 
> I could only think of skipping/dropping the descriptor when the current is still
> busy but the next one is ready.
> But it is not easily possible because the "stuck" descriptor gets ready after a
> huge delay.
> 
> Is this issue known already? Any suggestions?
> 

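[Editorial sketch] The check described in the quoted message (current descriptor still empty while later ones are already filled) can be illustrated with a small standalone C function. This is not the poster's patch and not the fec driver code. The struct rx_desc type and the ready_ahead() name are hypothetical, and the empty flag is assumed to be bit 15 of the status word, which matches the dump where 0x8840 is marked "E" and 0x0840 is not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 15 of the status word is the "empty" flag, as the
 * dump suggests (0x8840 is shown with "E", 0x0840 without). */
#define RX_EMPTY 0x8000u

/* Hypothetical, simplified descriptor: only the status word is modelled. */
struct rx_desc {
    uint16_t status;
};

/*
 * If the current descriptor is still empty, count how many consecutive
 * descriptors directly after it are already filled, looking at most
 * `lookahead` entries ahead and wrapping around the ring.
 * Returns 0 when the current descriptor is ready or the next one is empty.
 */
unsigned ready_ahead(const struct rx_desc *ring, size_t ring_size,
                     size_t cur, size_t lookahead)
{
    unsigned n = 0;

    assert(ring_size > 0 && cur < ring_size);

    if (!(ring[cur].status & RX_EMPTY))
        return 0; /* normal case: current descriptor can be processed */

    for (size_t i = 1; i <= lookahead && i < ring_size; i++) {
        const struct rx_desc *d = &ring[(cur + i) % ring_size];

        if (d->status & RX_EMPTY)
            break; /* first empty descriptor ends the run */
        n++;
    }
    return n;
}
```

A non-zero return value corresponds to the "ring error, current empty but next is not empty" condition in the log. A real driver would additionally have to read the descriptors from DMA memory with the appropriate barriers, which this sketch leaves out.
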
We have not seen this issue.

Yes, the IP has this erratum on i.MX6Q, so the workaround is to enable HW flow control.
Keep HW flow control enabled on your network connection to avoid FIFO overruns.

Regards,
Andy 
> 
> Thanks in advance
