Message-ID: <20090915205207.GI22743@ghostprotocols.net>
Date:	Tue, 15 Sep 2009 17:52:07 -0300
From:	Arnaldo Carvalho de Melo <acme@...hat.com>
To:	Nir Tzachar <nir.tzachar@...il.com>
Cc:	Linux Networking Development Mailing List 
	<netdev@...r.kernel.org>, Ziv Ayalon <ziv@...al.co.il>
Subject: Re: Fwd: [RFC v3] net: Introduce recvmmsg socket syscall

On Tue, Sep 15, 2009 at 09:20:13PM +0300, Nir Tzachar wrote:
> >> Setup:
> >> linux 2.6.29.2 with the third version of the patch, running on an
> >> Intel Xeon X3220 2.4GHz quad core, with 4Gbyte of ram, running Ubuntu
> >> 9.04
> >
> > Which NIC? 10 Gbit/s?
> 
> 1G. We do not care as much about throughput as we do about latency...

OK, but anyway the 10 Gbit/s cards I've briefly played with all
exhibited lower latencies than the 1 Gbit/s ones. In fact I've heard
about people moving to 10 Gbit/s not for the bandwidth, but for the
lower latencies :-)
 
> 
> >> Results:
> >> In general, recvmmsg beats the pants off the regular recvmsg by a
> >> whole millisecond (which might not sound like much, but is _really_
> >> a lot for us ;). The exact gain fluctuates between half a millisecond
> >> and 2 milliseconds, but the average is 1 millisecond.
> >
> > Do you have any testcase using publicly available software? Like qpidd,
> > etc? I'll eventually have to do that; for now I'm just using that
> > recvmmsg tool I posted, now with a recvmsg mode, then collecting 'perf
> > record' with and without callgraphs to post here. The client is just
> > pktgen spitting datagrams as if there is no tomorrow :-)
> 
> No. This was on a live, production system.

Wow :-)
> 
> > Showing that we get latency improvements is complementary to what I'm
> > doing, that is for now just showing the performance improvements and
> > showing what gives this improvement (perf counters runs).
> 
> We are more latency-oriented and naturally concentrate on this
> aspect of the problem. Producing numbers here is much easier...
> I can easily come up with a test application which just measures the
> latency of processing packets, by employing a sending loop between two
> hosts.
> 
> > If you could come up with a testcase that you could share with us,
> > perhaps using one of these AMQP implementations, that would be great
> > too.
> 
> Well, in our experience, AMQP and other solutions have latency issues.
> Moreover, the receiving end of our application is a regular multicast
> stream. I will implement the simple latency test I mentioned earlier,
> and post some results soon.

OK.
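
Just so we mean the same thing, an untested sketch of such a ping-pong
test is below. The address and port are placeholders and the other host
is assumed to simply echo each datagram back; this is only an
illustration, not your test.

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <time.h>

#define ROUNDS 100000

int main(void)
{
	struct sockaddr_in peer = {
		.sin_family = AF_INET,
		.sin_port   = htons(7777),	/* placeholder port */
	};
	char buf[100] = "ping";
	struct timespec t0, t1;
	long long total_ns = 0;
	int i, fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0) {
		perror("socket");
		return 1;
	}

	inet_pton(AF_INET, "192.168.0.2", &peer.sin_addr); /* placeholder peer */
	connect(fd, (struct sockaddr *)&peer, sizeof(peer));

	for (i = 0; i < ROUNDS; i++) {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		send(fd, buf, sizeof(buf), 0);
		recv(fd, buf, sizeof(buf), 0);	/* peer echoes it back */
		clock_gettime(CLOCK_MONOTONIC, &t1);
		total_ns += (t1.tv_sec - t0.tv_sec) * 1000000000LL +
			    (t1.tv_nsec - t0.tv_nsec);
	}

	printf("avg RTT: %lld ns over %d rounds\n",
	       total_ns / ROUNDS, ROUNDS);
	return 0;
}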

And here are some callgraphs, collected with the perf tools in the
kernel, for a very simple app (attached) that stops after receiving
1 million datagrams. No packets-per-second numbers, other than noting
that the recvmmsg test run collected far fewer samples (it finished
quicker).

The client is pktgen sending 100 byte datagrams over a single tg3
1 Gbit/s NIC; the server runs over a bnx2 1 Gbit/s link as well and is
just a sink:

With recvmmsg, batch of 8 datagrams, no timeout:

http://oops.ghostprotocols.net:81/acme/perf.recvmmsg.step1.cg.data.txt.bz2

And with recvmsg:

http://oops.ghostprotocols.net:81/acme/perf.recvmsg.step1.cg.data.txt.bz2

Notice where we are spending time in the recvmmsg case... :-)
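
For anyone who doesn't want to dig through the attachment, a minimal
recvmmsg() sink along those lines might look roughly like the untested
sketch below: batches of 8 datagrams, NULL timeout, stop after 1 million
datagrams. It is not the attached program, the port and buffer size are
arbitrary, and it assumes a libc that already wraps recvmmsg();
otherwise go through syscall() directly.

#define _GNU_SOURCE
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

#define BATCH	8
#define BUFSZ	1500
#define TOTAL	1000000

int main(void)
{
	struct mmsghdr msgs[BATCH];
	struct iovec iovs[BATCH];
	static char bufs[BATCH][BUFSZ];
	struct sockaddr_in addr = {
		.sin_family = AF_INET,
		.sin_port   = htons(7777),	/* arbitrary port */
	};
	long received = 0;
	int fd, i;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("socket/bind");
		return 1;
	}

	for (i = 0; i < BATCH; i++) {
		iovs[i].iov_base	   = bufs[i];
		iovs[i].iov_len		   = BUFSZ;
		memset(&msgs[i], 0, sizeof(msgs[i]));
		msgs[i].msg_hdr.msg_iov	   = &iovs[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	while (received < TOTAL) {
		/* NULL timeout: block until at least one datagram arrives */
		int n = recvmmsg(fd, msgs, BATCH, 0, NULL);

		if (n < 0) {
			perror("recvmmsg");
			return 1;
		}
		received += n;
	}

	printf("received %ld datagrams\n", received);
	return 0;
}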

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
