netdev - Re: [Bloat] Linux network is damn fast, need more use XDP (Was: DC behaviors today)


Date:   Mon, 4 Dec 2017 18:19:09 +0100
From:   Matthias Tafelmeier <matthias.tafelmeier@....net>
To:     Jesper Dangaard Brouer <brouer@...hat.com>,
        Dave Taht <dave@...t.net>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        bloat@...ts.bufferbloat.net,
        Christina Jacob <christina.jacob.koikara@...il.com>,
        Joel Wirāmu Pauling <joel@...ertia.net>,
        "cerowrt-devel@...ts.bufferbloat.net" 
        <cerowrt-devel@...ts.bufferbloat.net>,
        David Ahern <dsa@...ulusnetworks.com>,
        Tariq Toukan <tariqt@...lanox.com>
Subject: Re: [Bloat] Linux network is damn fast, need more use XDP (Was: DC
 behaviors today)

Hello,
> Scaling up to more CPUs and TCP-streams, Tariq[1] and I have shown that
> the Linux kernel network stack scales to 94Gbit/s (linerate minus
> overhead). But when the driver's page-recycler fails, we hit bottlenecks
> in the page-allocator, which cause negative scaling to around 43Gbit/s.
>
> [1] http://lkml.kernel.org/r/cef85936-10b2-5d76-9f97-cb03b418fd94@mellanox.com
>
> Linux has for a _long_ time been doing 10Gbit/s TCP-stream easily, on
> a SINGLE CPU.  This is mostly thanks to TSO/GRO aggregating packets,
> but over the last couple of years the network stack has been optimized
> (with UDP workloads), and as a result we can do 10G without TSO/GRO on
> a single CPU.  This is "only" 812Kpps with MTU size frames.
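
For reference, the 812Kpps figure can be reproduced with a little
arithmetic. This is only a sketch: it assumes a 1500-byte MTU and the
standard Ethernet per-frame overheads (header, FCS, preamble and
inter-frame gap, no VLAN tag), none of which are spelled out in the
quoted text.

```python
# Assumed on-wire overheads for one Ethernet frame, in bytes (no VLAN tag).
MTU = 1500            # payload carried by a full-sized frame
ETH_HEADER = 14       # destination MAC + source MAC + EtherType
FCS = 4               # frame check sequence
PREAMBLE_SFD = 8      # preamble + start-of-frame delimiter
INTERFRAME_GAP = 12   # mandatory idle time between frames


def packets_per_second(link_bps, mtu=MTU):
    """Full-sized frames per second that fit on a link of link_bps bit/s."""
    wire_bytes = mtu + ETH_HEADER + FCS + PREAMBLE_SFD + INTERFRAME_GAP
    return link_bps / (wire_bytes * 8)


pps = packets_per_second(10e9)
ns_per_packet = 1e9 / pps

print(f"{pps:,.0f} pps")                      # 812,744 pps
print(f"{ns_per_packet:.1f} ns per packet")   # 1230.4 ns per packet
```

In other words, at 10G with full-sized frames a single CPU has a
budget of roughly 1.2 microseconds per packet.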

I cannot find the reference anymore, but at a workshop you held during
one of the netdev conferences you stated that you were practically in
constant exchange with NIC vendors about getting them to tremendously
increase the number of RX/TX rings (queues). Further, you said that
there are hardly any limits to that number other than FPGA
magic/physical HW; "up to millions is viable" was the phrase back then.
May I ask where this ended up? Wouldn't that be key for massive
parallelization as well: at the extreme, one queue (producer) and one
CPU (consumer), or vice versa, per flow? Did this end up in this
SMART-NIC thingummy? The latter is rather targeted at XDP, no?
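
To make the per-flow idea concrete, here is a rough sketch of what is
meant. It is purely illustrative: `queue_for_flow` and
`dedicated_queues` are hypothetical helpers, the flows are made-up
5-tuples, and CRC32 merely stands in for whatever hash a real NIC
applies. With few queues, flows share a queue by hash; at the extreme,
every flow would get a queue (and consumer) of its own.

```python
import zlib


def queue_for_flow(flow, num_queues):
    """Map a flow 5-tuple onto one of num_queues queues by hashing it.

    CRC32 is only a stand-in here, not the hash a real NIC uses.
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_queues


def dedicated_queues(flows):
    """Per-flow extreme: give every distinct flow its own queue index."""
    return {flow: index for index, flow in enumerate(dict.fromkeys(flows))}


# Made-up flows: (src IP, dst IP, protocol, src port, dst port).
flows = [("10.0.0.1", "10.0.0.2", "tcp", 40000 + i, 80) for i in range(8)]

shared = {flow: queue_for_flow(flow, num_queues=4) for flow in flows}
private = dedicated_queues(flows)

print("distinct queues used with 4 queues: ", len(set(shared.values())))
print("distinct queues used with per-flow: ", len(set(private.values())))
```

With only 4 queues the 8 flows necessarily share, whereas the per-flow
mapping needs as many queues as there are flows, which is where the
"millions of queues" question comes from.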


-- 
Best regards

Matthias Tafelmeier

