netdev - RE: FEC on i.MX 7 transmit queue timeout


Message-ID: <AM4PR0401MB2260F011313883C522EA7684FFEA0@AM4PR0401MB2260.eurprd04.prod.outlook.com>
Date:   Thu, 4 May 2017 03:08:46 +0000
From:   Andy Duan <fugang.duan@....com>
To:     Stefan Agner <stefan@...er.ch>
CC:     "fugang.duan@...escale.com" <fugang.duan@...escale.com>,
        "festevam@...il.com" <festevam@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "netdev-owner@...r.kernel.org" <netdev-owner@...r.kernel.org>
Subject: RE: FEC on i.MX 7 transmit queue timeout

From: Stefan Agner <stefan@...er.ch> Sent: Thursday, May 04, 2017 9:22 AM
>To: Andy Duan <fugang.duan@....com>
>Cc: fugang.duan@...escale.com; festevam@...il.com;
>netdev@...r.kernel.org; netdev-owner@...r.kernel.org
>Subject: Re: FEC on i.MX 7 transmit queue timeout
>
>Hi Andy,
>
>On 2017-04-20 19:48, Andy Duan wrote:
>> On 2017-04-20 07:15, Stefan Agner wrote:
>>> I tested again with imx6sx-fec compatible string. I could reproduce
>>> it on a Colibri with i.MX 7Dual. But not always: It really depends
>>> whether queue 2 is counting up or not. Just after boot, I check
>>> /proc/interrupts twice, if queue 2 is counting it will happen!
>>>
>>> But if only queue 0 is mostly in use, then it seems to work just fine.
>> If your case is only running best effort like tcp/udp, you can re-set
>> the "fsl,num-tx-queues" and "fsl,num-rx-queues" to 1 in board dts file.
>> Other two queues are for AVB audio/video queues, they have high
>> priority than queue 0. If running iperf tcp test on the three queues,
>> then the tcp segment may be out-of-order that cause net watchdog
>> timeout.
>>>
>>> I also tried i.MX 7Dual SabreSD here, and the same thing. I had to
>>> reboot 3 times, then queue 2 was counting:
>>>   57:          8     GIC-0 150 Level     30be0000.ethernet
>>>   58:      20137     GIC-0 151 Level     30be0000.ethernet
>>>   59:       9269     GIC-0 152 Level     30be0000.ethernet
>>>
>>> It took me about 40 minutes on Sabre until it happened, and I had to
>>> force it using iperf, but then I got the ring dumps:
>> My board had ran more than 47 hours with nfs rootfs in 4.11.0-rc6, but
>> not running iperf.
>> I am testing with iperf.
>
>Any update on this issue?
>
>When using iperf (server) on the board with Linux 4.11 the issue appears
>within a few iperf iterations on a Sabre (TO 1.2, Board Rev C, if that matters)...
>
I don't know whether you received my last mail. It may not have gone through, since I received some rejection notices.

If your use case only runs best-effort traffic such as TCP/UDP, you can set "fsl,num-tx-queues" and "fsl,num-rx-queues" to 1 in the board dts file.
The other two queues are for AVB audio/video traffic, and they have higher priority than queue 0. If an iperf TCP test runs across all three queues, the TCP segments may go out of order, which causes the net watchdog timeout.
In the fsl kernel tree there is a patch that selects only queue 0 for best-effort traffic such as TCP/UDP. Please test again on your board; if there is no problem, I will upstream the patch.
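
For reference, the queue-count change suggested above might look roughly like this in a board dts file. This is only a sketch: the &fec1 label is an assumption, so use whatever label your SoC .dtsi gives the 30be0000.ethernet node. The two property names are the ones quoted above.

```dts
/* Sketch only: &fec1 is an assumed label for the 30be0000.ethernet node. */
&fec1 {
	/* Use a single queue for best-effort traffic (TCP/UDP). */
	fsl,num-tx-queues = <1>;
	fsl,num-rx-queues = <1>;
};
```

After rebuilding the dtb and rebooting, /proc/interrupts should presumably show only the queue 0 interrupt line counting for 30be0000.ethernet.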