Date:   Mon, 4 Dec 2017 18:19:09 +0100
From:   Matthias Tafelmeier <matthias.tafelmeier@....net>
To:     Jesper Dangaard Brouer <brouer@...hat.com>,
        Dave Taht <dave@...t.net>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        bloat@...ts.bufferbloat.net,
        Christina Jacob <christina.jacob.koikara@...il.com>,
        Joel Wirāmu Pauling <joel@...ertia.net>,
        "cerowrt-devel@...ts.bufferbloat.net" 
        <cerowrt-devel@...ts.bufferbloat.net>,
        David Ahern <dsa@...ulusnetworks.com>,
        Tariq Toukan <tariqt@...lanox.com>
Subject: Re: [Bloat] Linux network is damn fast, need more use XDP (Was: DC
 behaviors today)

Hello,
> Scaling up to more CPUs and TCP-stream, Tariq[1] and I have showed the
> Linux kernel network stack scales to 94Gbit/s (linerate minus overhead).
> But when the drivers page-recycler fails, we hit bottlenecks in the
> page-allocator, that cause negative scaling to around 43Gbit/s.
>
> [1] http://lkml.kernel.org/r/cef85936-10b2-5d76-9f97-cb03b418fd94@mellanox.com
>
> Linux have for a _long_ time been doing 10Gbit/s TCP-stream easily, on
> a SINGLE CPU.  This is mostly thanks to TSO/GRO aggregating packets,
> but last couple of years the network stack have been optimized (with
> UDP workloads), and as a result we can do 10G without TSO/GRO on a
> single-CPU.  This is "only" 812Kpps with MTU size frames.
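
The 812Kpps figure quoted above can be checked with a little arithmetic. This is a minimal sketch, assuming a 1500-byte MTU and standard Ethernet framing overhead on the wire (14-byte header, 4-byte FCS, 8-byte preamble/SFD, 12-byte inter-frame gap); those overhead values and all names below are assumptions for illustration, not taken from the quoted mail.

```python
# Rough check of the quoted packet rate at 10 Gbit/s with MTU-size frames.
# The framing overheads below are assumed standard Ethernet values.
LINK_RATE_BPS = 10_000_000_000  # 10 Gbit/s

MTU = 1500             # payload bytes per frame (assumed)
ETH_HEADER = 14        # dst MAC + src MAC + EtherType
FCS = 4                # frame check sequence
PREAMBLE_SFD = 8       # preamble + start-of-frame delimiter
INTER_FRAME_GAP = 12   # minimum idle time between frames, in byte times


def packets_per_second(link_rate_bps, mtu):
    """Return the frame rate at which MTU-size frames fill the link."""
    wire_bytes = mtu + ETH_HEADER + FCS + PREAMBLE_SFD + INTER_FRAME_GAP
    return link_rate_bps / (wire_bytes * 8)


pps = packets_per_second(LINK_RATE_BPS, MTU)
print(f"{pps / 1000:.1f} Kpps")  # prints "812.7 Kpps"
```

Under these assumptions each frame occupies 1538 bytes on the wire, which gives roughly 812.7 thousand packets per second and matches the quoted number.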

I cannot find the reference anymore, but at a workshop you once held at
some netdev you stated that you were in close exchange with NIC vendors
about getting them to greatly increase the number of RX/TX rings
(queues). Further, that there is hardly any limit to that number other
than FPGA magic/physical HW; "up to millions is viable" was the phrase
coined back then.  May I ask where this ended up? Wouldn't that be key
for massive parallelization as well, with one queue (producer) and one
CPU (consumer), or vice versa, per flow at the extreme? Did this end up
in this SMART-NIC thingummy? The latter is rather targeted at XDP, no?
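
To make the queue-per-flow idea concrete, here is a small sketch of how flows spread over a given number of queues. It is only an illustration: real NICs typically steer packets with RSS (commonly a Toeplitz hash over the flow tuple), whereas this uses `zlib.crc32` as a stand-in, and the function names and example flows are hypothetical.

```python
import zlib


def queue_for_flow(flow, num_queues):
    """Map a flow tuple to a queue index with a stand-in hash (not Toeplitz)."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_queues


def spread(flows, num_queues):
    """Group flows by the queue they hash to."""
    queues = {}
    for flow in flows:
        queues.setdefault(queue_for_flow(flow, num_queues), []).append(flow)
    return queues


# Eight hypothetical TCP flows differing only in source port.
flows = [("10.0.0.1", 40000 + i, "10.0.0.2", 80, "tcp") for i in range(8)]

for num_queues in (2, 8, 1024):
    used = spread(flows, num_queues)
    busiest = max(len(members) for members in used.values())
    print(f"{num_queues} queues: {len(used)} in use, busiest holds {busiest} flow(s)")
```

With few queues, several flows necessarily share one queue and therefore one consuming CPU. As the queue count grows far beyond the flow count, sharing becomes rare, although a hash can still collide; a strict one-queue-per-flow setup would need explicit steering rather than hashing.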


-- 
Best regards

Matthias Tafelmeier

