netdev - Re: Packet mmap: TX RING and zero copy

Message-ID: <20080905071754.GA25998@2ka.mipt.ru>
Date:	Fri, 5 Sep 2008 11:17:54 +0400
From:	Evgeniy Polyakov <johnpol@....mipt.ru>
To:	Johann Baudy <johaahn@...il.com>
Cc:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: Packet mmap: TX RING and zero copy

Hi Johann.

On Thu, Sep 04, 2008 at 04:44:15PM +0200, Johann Baudy (johaahn@...il.com) wrote:
> I'm finally able to run a full zero-copy mechanism with a UDP socket, as you said.
> Unfortunately, I need at least one vmsplice() system call per UDP
> packet.
> A mere vmsplice() (memory to pipe) costs a lot (80µs of CPU), and the splice()
> (pipe to socket) call is worse...
> 80µs is approximately the time needed to send 12Kbytes at 1Gbps. As I
> need to send packets of 7200 bytes (with no fragmentation)...
> I can't use this mechanism, unfortunately. I've only reached 20Mbytes/s.

vmsplice() can be slow; try injecting the header via a usual send() call or,
better, do not use it at all for testing.
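
To put the quoted numbers in perspective, here is a rough back-of-the-envelope
check. It is only a sketch: it assumes decimal units (1Gbps = 10^9 bits/s,
20Mbytes/s = 20 * 10^6 bytes/s) and ignores Ethernet/IP/UDP framing overhead.
The function names are made up for this illustration.

```python
def bytes_on_wire(duration_us, link_bps):
    """Bytes a link running at link_bps bits/s carries in duration_us microseconds."""
    return duration_us * link_bps // 8 // 1_000_000


def wire_time_us(packet_bytes, link_bps):
    """Microseconds needed to put packet_bytes on a link of link_bps bits/s."""
    return packet_bytes * 8 * 1_000_000 / link_bps


def per_packet_time_us(packet_bytes, throughput_bytes_per_s):
    """Average microseconds spent per packet at a measured throughput."""
    return packet_bytes * 1_000_000 / throughput_bytes_per_s


LINK_BPS = 1_000_000_000   # assumed: 1Gbps, decimal
PACKET_BYTES = 7200        # packet size from the quoted message

print(bytes_on_wire(80, LINK_BPS))                   # payload carried in 80µs
print(wire_time_us(PACKET_BYTES, LINK_BPS))          # wire time of one packet
print(per_packet_time_us(PACKET_BYTES, 20_000_000))  # time per packet at 20Mbytes/s
```

Under these assumptions a 7200-byte packet occupies the wire for about 57.6µs,
so an 80µs vmsplice() per packet already exceeds the per-packet budget, and
20Mbytes/s corresponds to roughly 360µs spent per packet.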

> You can find below an ftrace of vmsplice(), in case you find something
> abnormal ... :) :
> (The 80µs result is the average vmsplice() duration measured with
> gettimeofday(), WITHOUT FTRACE IN THE KERNEL CONFIG)

The number of gettimeofday() calls and friends is excessive, but that may come
from the trace tool itself. kill_fasync() also took too much time (the top CPU
user is at the bottom, I suppose?); do you use SIGIO? Also, VMA traversal and
page checking are not what will be done in network code and your project, so
they also add overhead. Please try without vmsplice() at all: usual
splice()/sendfile() _has_ to saturate the link, otherwise we have a
serious problem.
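
For reference, a baseline test along these lines could look like the sketch
below. It is not the code used in this thread: it uses Python's os.sendfile()
wrapper around sendfile(2) for brevity, and the function name
send_file_zero_copy and the 1MB chunk size are assumptions made for the
example. The socket is expected to be already connected and blocking.

```python
import os


def send_file_zero_copy(sock, path, chunk=1 << 20):
    """Send the whole file at `path` over connected socket `sock` using sendfile(2).

    Returns the number of bytes handed to the socket.
    """
    sent_total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent_total < size:
            # The kernel moves data from the page cache to the socket;
            # no copy through a user-space buffer is involved.
            sent = os.sendfile(sock.fileno(), f.fileno(), sent_total,
                               min(chunk, size - sent_total))
            if sent == 0:
                break  # unexpected end of file (e.g. file was truncated)
            sent_total += sent
    return sent_total
```

Timing this call over a large file and dividing the byte count by the elapsed
time gives a throughput figure to compare against the link rate.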

> So, I will return to work on my circular buffer.
> This way I can control the (Ethernet frame length)*(number of frames)/
> (number of system calls) ratio.

Not to distract you from the project, but you can still do the same with
existing methods and a smaller amount of work. But I should be the last one to
say that creating tricky hacks to implement an idea should be abandoned in
favour of the standard (even slow) methods :)

-- 
	Evgeniy Polyakov