Message-ID: <7e0dd21a0809040744q3ee65695uc33f6cc26a1fe4dd@mail.gmail.com>
Date:	Thu, 4 Sep 2008 16:44:15 +0200
From:	"Johann Baudy" <johaahn@...il.com>
To:	"Evgeniy Polyakov" <johnpol@....mipt.ru>
Cc:	"David Miller" <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: Packet mmap: TX RING and zero copy

Hi Evgeniy,


> Looks like you try to sendfile() over packet socket.
> Both tcp and udp sockets have sendpage method.
>
> Or your hardware or driver do not support needed functionality, so
> tcp_sendpage() falls back to sock_no_sendpage(). From your dump I think
> it is the first case above. Well, after I read it again, I found word
> packet_sendmsg(), which explains everything. Please use tcp or udp
> socket for splice/sendfile test.
>

I'm finally able to run a full zero-copy mechanism with a UDP socket, as you suggested.
Unfortunately, I need at least one vmsplice() system call per UDP packet.
The vmsplice() call alone (memory to pipe) is expensive (80 µs of CPU), and the
splice() call (pipe to socket) is even worse.
80 µs is approximately the time needed to send 10 Kbytes at 1 Gbps. As I
need to send packets of 7200 bytes (without fragmentation), I can't use this
mechanism, unfortunately. I've only reached 20 Mbytes/s.
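
To put these figures side by side, here is a small sketch of the arithmetic in C. The helper names are made up for illustration, and 20 Mbytes/s is taken as 20e6 bytes/s, which is an assumption.

```c
#include <stdio.h>

/* Time in microseconds needed to put `bytes` on a link of `link_bps` bits/s. */
static double wire_time_us(double bytes, double link_bps)
{
    return bytes * 8.0 / link_bps * 1e6;
}

/* Upper bound on throughput (bytes/s) if each packet costs `cost_us` of CPU. */
static double max_throughput_Bps(double packet_bytes, double cost_us)
{
    return packet_bytes / (cost_us * 1e-6);
}

/* Per-packet time in microseconds implied by a measured throughput (bytes/s). */
static double per_packet_us(double packet_bytes, double throughput_Bps)
{
    return packet_bytes / throughput_Bps * 1e6;
}

/* Print the three figures for 7200-byte packets (hypothetical helper). */
static void print_budget(void)
{
    printf("wire time per packet : %.1f us\n", wire_time_us(7200.0, 1e9));
    printf("ceiling at 80 us/pkt : %.0f bytes/s\n",
           max_throughput_Bps(7200.0, 80.0));
    printf("time/pkt at 20 MB/s  : %.0f us\n", per_packet_us(7200.0, 20e6));
}
```

A 7200-byte packet needs 57.6 µs on a 1 Gbps wire, so an 80 µs vmsplice() per packet already exceeds the wire time. At 80 µs per packet the ceiling would be 90 Mbytes/s from vmsplice() alone, and the measured 20 Mbytes/s corresponds to about 360 µs spent per packet.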

Below is an ftrace of vmsplice(), in case you spot anything abnormal :)
(The 80 µs figure is an average of the vmsplice() duration measured with
gettimeofday(), WITHOUT FTRACE IN THE KERNEL CONFIG.)

    main-849   [00] ..  1 4154502892.139088: sys_gettimeofday <-ret_from_syscall
    main-849   [00] ..  1 4154502892.139090: do_gettimeofday <-sys_gettimeofday
    main-849   [00] ..  1 4154502892.139092: getnstimeofday <-do_gettimeofday
    main-849   [00] ..  1 4154502892.139100: sys_vmsplice <-ret_from_syscall
    main-849   [00] ..  1 4154502892.139107: fget_light <-sys_vmsplice
    main-849   [00] ..  1 4154502892.139118: rt_down_read <-sys_vmsplice
    main-849   [00] ..  1 4154502892.139120: __rt_down_read <-rt_down_read
    main-849   [00] ..  1 4154502892.139124: rt_mutex_down_read <-__rt_down_read
    main-849   [00] ..  1 4154502892.139132: pagefault_disable <-sys_vmsplice
    main-849   [00] ..  1 4154502892.139136: pagefault_enable <-sys_vmsplice
    main-849   [00] ..  1 4154502892.139141: get_user_pages <-sys_vmsplice
    main-849   [00] ..  1 4154502892.139147: find_extend_vma <-get_user_pages
    main-849   [00] ..  1 4154502892.139150: find_vma <-find_extend_vma
    main-849   [00] ..  1 4154502892.139158: _cond_resched <-get_user_pages
    main-849   [00] ..  1 4154502892.139161: follow_page <-get_user_pages
    main-849   [00] ..  1 4154502892.139165: rt_spin_lock <-follow_page
    main-849   [00] ..  1 4154502892.139167: __rt_spin_lock <-rt_spin_lock
    main-849   [00] ..  1 4154502892.139171: vm_normal_page <-follow_page
    main-849   [00] ..  1 4154502892.139176: mark_page_accessed <-follow_page
    main-849   [00] ..  1 4154502892.139180: rt_spin_unlock <-follow_page
    main-849   [00] ..  1 4154502892.139185: flush_dcache_page <-get_user_pages
    main-849   [00] ..  1 4154502892.139192: rt_up_read <-sys_vmsplice
    main-849   [00] ..  1 4154502892.139194: rt_mutex_up_read <-rt_up_read
    main-849   [00] ..  1 4154502892.139203: splice_to_pipe <-sys_vmsplice
    main-849   [00] ..  1 4154502892.139206: _mutex_lock <-splice_to_pipe
    main-849   [00] ..  1 4154502892.139209: rt_mutex_lock <-_mutex_lock
    main-849   [00] ..  1 4154502892.139217: _mutex_unlock <-splice_to_pipe
    main-849   [00] ..  1 4154502892.139221: rt_mutex_unlock <-_mutex_unlock
    main-849   [00] ..  1 4154502892.139224: kill_fasync <-splice_to_pipe
    main-849   [00] ..  1 4154502892.139235: sys_gettimeofday <-ret_from_syscall
    main-849   [00] ..  1 4154502892.139237: do_gettimeofday <-sys_gettimeofday
    main-849   [00] ..  1 4154502892.139239: getnstimeofday <-do_gettimeofday
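
The trace shows vmsplice() bracketed by two gettimeofday() calls. A minimal user-space sketch of that kind of measurement could look like the following. It is not the test program used here: the function names `elapsed_us` and `time_vmsplice_us` are hypothetical, and it times a single call rather than averaging over many.

```c
#define _GNU_SOURCE             /* needed for the vmsplice() declaration */
#include <fcntl.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Elapsed microseconds between two gettimeofday() samples. */
static long elapsed_us(const struct timeval *start, const struct timeval *end)
{
    return (long)(end->tv_sec - start->tv_sec) * 1000000L
         + (long)(end->tv_usec - start->tv_usec);
}

/*
 * Time one vmsplice() of `len` bytes at `buf` into the write end of a pipe.
 * Returns the elapsed microseconds, or -1 if vmsplice() failed or was short.
 */
static long time_vmsplice_us(int pipe_wr_fd, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct timeval t0, t1;
    ssize_t n;

    gettimeofday(&t0, NULL);
    n = vmsplice(pipe_wr_fd, &iov, 1, 0);   /* map user pages into the pipe */
    gettimeofday(&t1, NULL);

    if (n != (ssize_t)len)
        return -1;
    return elapsed_us(&t0, &t1);
}
```

To get an average comparable to the 80 µs figure, one would call `time_vmsplice_us()` in a loop on the write end of a pipe() while draining the read end, and divide the summed durations by the number of successful calls.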


So, I will go back to working on my circular buffer.
This way I can control the (Ethernet frame length) * (number of frames) /
(number of system calls) ratio.
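
As an illustration of that ratio only, here is a hedged user-space model of a ring where several frames are queued and then handed over by one flush that stands in for a single system call. Everything in it is hypothetical (`tx_ring`, `ring_queue`, `ring_flush`, the slot count); it is not the kernel implementation, and it copies frames with memcpy() where a real mmap'ed ring would be written in place.

```c
#include <stddef.h>
#include <string.h>

#define RING_SLOTS 8        /* hypothetical ring depth */
#define FRAME_MAX  7200     /* frame size mentioned in this thread */

enum slot_status { SLOT_FREE = 0, SLOT_READY };

struct tx_slot {
    enum slot_status status;
    size_t len;
    unsigned char data[FRAME_MAX];
};

struct tx_ring {
    struct tx_slot slot[RING_SLOTS];
    unsigned int head;       /* next slot the producer fills */
    unsigned int tail;       /* next slot the flush consumes */
    unsigned long syscalls;  /* flushes, each standing in for one system call */
    unsigned long bytes;     /* total bytes handed over so far */
};

/* Producer: place one frame in the next free slot. 0 on success, -1 if the
 * ring is full or the frame is too large. */
static int ring_queue(struct tx_ring *r, const void *frame, size_t len)
{
    struct tx_slot *s = &r->slot[r->head];

    if (len > FRAME_MAX || s->status != SLOT_FREE)
        return -1;
    memcpy(s->data, frame, len);   /* model only; a real ring avoids this copy */
    s->len = len;
    s->status = SLOT_READY;
    r->head = (r->head + 1) % RING_SLOTS;
    return 0;
}

/* Stand-in for the single system call: consume every READY frame at once.
 * Returns the number of frames handed over. */
static unsigned int ring_flush(struct tx_ring *r)
{
    unsigned int frames = 0;

    while (r->slot[r->tail].status == SLOT_READY) {
        r->bytes += r->slot[r->tail].len;
        r->slot[r->tail].status = SLOT_FREE;
        r->tail = (r->tail + 1) % RING_SLOTS;
        frames++;
    }
    r->syscalls++;
    return frames;
}

/* (frame length) * (number of frames) / (number of system calls). */
static double bytes_per_syscall(const struct tx_ring *r)
{
    return r->syscalls ? (double)r->bytes / (double)r->syscalls : 0.0;
}
```

Queueing four 7200-byte frames and flushing once gives 28800 bytes per system call, compared with 7200 when every packet needs its own call.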

By analysing the kernel's splice and pktgen code, I've also found a
clean way to perform zero copy between my circular buffer and the
socket buffer. I will test it and let you know about the changes and
results.

Many thanks for your help,
Johann Baudy

-- 
Johann Baudy
johaahn@...il.com