Date:   Mon, 29 Jun 2020 18:29:06 +0200
From:   "Tobias Waldekranz" <tobias@...dekranz.com>
To:     "Andy Duan" <fugang.duan@....com>,
        "David Miller" <davem@...emloft.net>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [EXT] Re: [PATCH net-next] net: ethernet: fec: prevent tx
 starvation under high rx load

On Sun Jun 28, 2020 at 8:23 AM CEST, Andy Duan wrote:
> I have never seen a bandwidth test cause the netdev watchdog to
> trip. Can you describe the reproduction steps in the commit so we
> can reproduce it locally? Thanks.

My setup uses an i.MX8M Nano EVK connected to an Ethernet switch, but
I can get the same results with a direct connection to a PC.

On the iMX, configure two VLANs on top of the FEC and enable IPv4
forwarding.
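For reference, a minimal sketch of the iMX-side configuration. It
assumes the FEC interface is eth0; the VLAN IDs (2 and 3) and the
gateway addresses are illustrative, chosen to match the subnets in the
trafgen config further down:

```shell
# Create two VLANs on top of the FEC (eth0 assumed)
ip link add link eth0 name eth0.2 type vlan id 2
ip link add link eth0 name eth0.3 type vlan id 3
ip addr add 10.0.2.1/24 dev eth0.2
ip addr add 10.0.3.1/24 dev eth0.3
ip link set eth0.2 up
ip link set eth0.3 up

# Enable IPv4 forwarding so the iMX routes between the VLANs
sysctl -w net.ipv4.ip_forward=1
```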

On the PC, configure two VLANs and put them in different
namespaces. From one namespace, use trafgen to generate a flow that
the iMX will route from the first VLAN to the second and then back
towards the second namespace on the PC.
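The PC side could be set up along these lines; the NIC name (enp1s0)
and the namespace names are placeholders:

```shell
# Two namespaces, one VLAN in each
ip netns add ns2
ip netns add ns3
ip link add link enp1s0 name enp1s0.2 type vlan id 2
ip link add link enp1s0 name enp1s0.3 type vlan id 3
ip link set enp1s0.2 netns ns2
ip link set enp1s0.3 netns ns3

# Address each VLAN and route via the iMX so forwarded traffic
# comes back to the other namespace
ip -n ns2 addr add 10.0.2.2/24 dev enp1s0.2
ip -n ns2 link set enp1s0.2 up
ip -n ns2 route add default via 10.0.2.1
ip -n ns3 addr add 10.0.3.2/24 dev enp1s0.3
ip -n ns3 link set enp1s0.3 up
ip -n ns3 route add default via 10.0.3.1
```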

Something like:

    {
        eth(sa=PC_MAC, da=IMX_MAC),
        ipv4(saddr=10.0.2.2, daddr=10.0.3.2, ttl=2),
        udp(sp=1, dp=2),
        "Hello world"
    }
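With that saved to a file (flow.cfg here, name illustrative), the flow
can be started from the first namespace with something like:

```shell
# Generate the flow from ns2 out of the first VLAN; the iMX routes it
# to 10.0.3.2 and sends it back towards ns3 on the PC
ip netns exec ns2 trafgen --dev enp1s0.2 --conf flow.cfg
```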

Wait a couple of seconds and then you'll see the output from fec_dump.

In the same setup I also see a weird issue when running a TCP flow
using iperf3. Most of the time (~70%) when I start the iperf3 client
I'll see ~450Mbps of throughput. In the other case (~30%) I'll see
~790Mbps. The system is "stably bi-modal", i.e. whichever rate is
reached in the beginning is then sustained for as long as the session
is kept alive.
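In the same namespace setup, the TCP test is roughly (duration
arbitrary):

```shell
# iperf3 server in one namespace, client in the other; the stream is
# routed through the iMX just like the trafgen flow
ip netns exec ns3 iperf3 -s &
ip netns exec ns2 iperf3 -c 10.0.3.2 -t 30
```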

I've inserted some tracepoints in the driver to try to understand
what's going on: https://svgshare.com/i/MVp.svg

What I can't figure out is why the Tx buffers seem to be collected at
a much slower rate in the slow case (top in the picture). If we fall
behind in one NAPI poll, we should catch up at the next call (which we
can see in the fast case). But in the slow case we keep falling
further and further behind until we freeze the queue. Is this
something you've ever observed? Any ideas?

Thank you
