Message-Id: <C3U9EFL9CA15.QDKTU9Y4EZXM@wkz-x280>
Date: Tue, 30 Jun 2020 09:30:41 +0200
From: "Tobias Waldekranz" <tobias@...dekranz.com>
To: "Andy Duan" <fugang.duan@....com>,
"David Miller" <davem@...emloft.net>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [EXT] Re: [PATCH net-next] net: ethernet: fec: prevent tx
starvation under high rx load
On Tue Jun 30, 2020 at 8:27 AM CEST, Andy Duan wrote:
> From: Tobias Waldekranz <tobias@...dekranz.com> Sent: Tuesday, June 30,
> 2020 12:29 AM
> > On Sun Jun 28, 2020 at 8:23 AM CEST, Andy Duan wrote:
> > > I have never seen a bandwidth test trip the netdev watchdog.
> > > Can you describe the reproduction steps from the commit, so we can
> > > reproduce it locally? Thanks.
> >
> > My setup uses an i.MX8M Nano EVK connected to an ethernet switch, but I
> > can get the same results with a direct connection to a PC.
> >
> > On the iMX, configure two VLANs on top of the FEC and enable IPv4
> > forwarding.
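> >
> > For reference, the iMX-side configuration I use is roughly the
> > following (eth0 is the FEC here; the addresses are just what I
> > picked, matching the flow further down):
> >
> >   ip link add link eth0 name eth0.2 type vlan id 2
> >   ip link add link eth0 name eth0.3 type vlan id 3
> >   ip addr add 10.0.2.1/24 dev eth0.2
> >   ip addr add 10.0.3.1/24 dev eth0.3
> >   ip link set eth0.2 up
> >   ip link set eth0.3 up
> >   sysctl -w net.ipv4.ip_forward=1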
> >
> > On the PC, configure two VLANs and put them in different namespaces. From
> > one namespace, use trafgen to generate a flow that the iMX will route from
> > the first VLAN to the second and then back towards the second namespace on
> > the PC.
> >
> > Something like:
> >
> > {
> >   eth(sa=PC_MAC, da=IMX_MAC),
> >   ipv4(saddr=10.0.2.2, daddr=10.0.3.2, ttl=2),
> >   udp(sp=1, dp=2),
> >   "Hello world"
> > }
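> >
> > On the PC side, the namespace setup is along these lines (interface
> > and namespace names are just examples):
> >
> >   ip netns add ns0
> >   ip netns add ns1
> >   ip link add link eth0 name eth0.2 type vlan id 2
> >   ip link add link eth0 name eth0.3 type vlan id 3
> >   ip link set eth0.2 netns ns0
> >   ip link set eth0.3 netns ns1
> >   ip -n ns0 addr add 10.0.2.2/24 dev eth0.2
> >   ip -n ns1 addr add 10.0.3.2/24 dev eth0.3
> >   ip -n ns0 link set eth0.2 up
> >   ip -n ns1 link set eth0.3 up
> >   ip -n ns0 route add default via 10.0.2.1
> >   ip -n ns1 route add default via 10.0.3.1
> >
> > With the flow above saved as flow.cfg, I then start the generator
> > with something like:
> >
> >   ip netns exec ns0 trafgen --dev eth0.2 --conf flow.cfg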
> >
> > Wait a couple of seconds and then you'll see the output from fec_dump.
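> >
> > (fec_dump writes to the kernel log, so
> >
> >   dmesg | grep -A 8 'TX ring dump'
> >
> > or similar should pick it out.)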
> >
> > In the same setup I also see a weird issue when running a TCP flow using
> > iperf3. Most of the time (~70%) when I start the iperf3 client I'll see
> > ~450Mbps of throughput. In the other case (~30%) I'll see ~790Mbps. The
> > system is "stably bi-modal", i.e. whichever rate is reached in the beginning is
> > then sustained for as long as the session is kept alive.
> >
> > I've inserted some tracepoints in the driver to try to understand what's going
> > on:
> > https://svgshare.com/i/MVp.svg
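> >
> > The tracepoints themselves are just local debug patches, but the
> > stock events give a similar picture, e.g.:
> >
> >   trace-cmd record -e napi:napi_poll -e net:net_dev_xmit \
> >       -e net:netif_receive_skb sleep 10
> >   trace-cmd report | head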
> >
> > What I can't figure out is why the Tx buffers seem to be collected at a much
> > slower rate in the slow case (top in the picture). If we fall behind in one NAPI
> > poll, we should catch up at the next call (which we can see in the fast case).
> > But in the slow case we keep falling further and further behind until we freeze
> > the queue. Is this something you've ever observed? Any ideas?
>
> Before, our test cases did not reproduce the issue: the CPU has more
> bandwidth than the ethernet uDMA, so there is always a chance to
> complete the current NAPI poll. By the next poll, work_tx has been
> updated, so we never hit the issue.
It appears this has nothing to do with routing back out through the same
interface.
I get the same bi-modal behavior if I just run the iperf3 server on the
iMX and have it act as the transmitting side, i.e. on the PC I run:
iperf3 -c $IMX_IP -R
It would be very interesting to see what numbers you get in this
scenario.
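
For completeness, the whole test in this scenario is just:

  imx$ iperf3 -s
  pc$  iperf3 -c $IMX_IP -R -t 30

(-R reverses the direction, so the iMX ends up as the transmitter.)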