Date:	Thu, 07 Feb 2008 21:39:03 -0800
From:	Max Krasnyansky <maxk@...lcomm.com>
To:	Rusty Russell <rusty@...tcorp.com.au>
CC:	netdev@...r.kernel.org, Herbert Xu <herbert@...dor.apana.org.au>,
	virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH 2/3] partial checksum and GSO support for tun/tap.

Rusty Russell wrote:
> (Changes since last time: we now have explicit IFF_RECV_CSUM and 
> IFF_RECV_GSO bits, and some renaming of virtio_net hdr)
> 
> We use the virtio_net_hdr: it is an ABI already and designed to
> encapsulate such metadata as GSO and partial checksums.
> 
> IFF_VIRTIO_HDR means you will write and read a 'struct virtio_net_hdr'
> at the start of each packet.  You can always write packets with
> partial checksum and gso to the tap device using this header.
> 
> IFF_RECV_CSUM means you can handle reading packets with partial
> checksums.  If IFF_RECV_GSO is also set, it means you can handle
> reading (all types of) GSO packets.
>
> Note that there is no easy way to detect if these flags are supported:
> see next patch.

Again, sorry for the delay in replying. Here are my thoughts on this.

I like the approach in general. Certainly the part that creates skbs out of the user-space
pages looks good. And it fits nicely into the existing TUN driver model.
However, I actually wanted to change the model :). In particular I'm talking about
	"syscall per packet"
After messing around with things like libe1000.sf.net I'd like to make the TUN/TAP driver look
more like modern NICs to user-space. In other words, I'm thinking about introducing RX and
TX rings that user-space can then mmap() and write/read packet descriptors to/from.
That will reduce the number of system calls that the user-space app needs to make. That by
itself saves a lot of overhead; combined with GSO it'll be lightning fast.
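
A minimal sketch of the descriptor-ring idea, assuming a single producer and a
single consumer. Every name here (pkt_desc, pkt_ring, ring_push, ring_pop,
RING_SIZE) is hypothetical and is not part of the TUN/TAP driver or of any
posted patch. In a real design the ring would sit in a region that user-space
mmap()s from the driver; here it is an ordinary struct, so only the indexing
and ordering logic is shown.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u  /* hypothetical slot count; must be a power of two */

/* Hypothetical descriptor: locates one packet inside a shared buffer area. */
struct pkt_desc {
    uint32_t offset;  /* byte offset of the packet in the shared area */
    uint32_t len;     /* packet length in bytes */
};

/* Hypothetical ring; head and tail are free-running counters. */
struct pkt_ring {
    _Atomic uint32_t head;            /* next slot the producer will fill */
    _Atomic uint32_t tail;            /* next slot the consumer will drain */
    struct pkt_desc desc[RING_SIZE];
};

/* Producer side: post a descriptor without any system call. */
static bool ring_push(struct pkt_ring *r, struct pkt_desc d)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return false;                 /* ring is full */

    r->desc[head & (RING_SIZE - 1)] = d;
    /* Publish the slot only after its contents have been written. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: take the oldest posted descriptor, if there is one. */
static bool ring_pop(struct pkt_ring *r, struct pkt_desc *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;                 /* ring is empty */

    *out = r->desc[tail & (RING_SIZE - 1)];
    /* Hand the slot back to the producer only after it has been read. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

With a ring like this, one side can post many descriptors and the other can
drain them in a batch, so a notification (if any is needed) is paid once per
batch rather than once per packet.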

I'm going to send you a version that I cooked up a while ago in a private email. I do not want
to spam netdev :). It's not quite the RX/TX ring model but it'll give you an idea.
I did some profiling, and the PPS (packets per second) numbers that user-space can handle literally
skyrocketed.

btw We had a long discussion with Eugeniy Polakov on mapping user pages vs mmap()ing a large
kernel buffer and doing normal memcpy() (i.e. instead of copy_to_user()/copy_from_user()) in
the kernel. On small packets the overhead of get_user_pages() eats up all the benefits. So we
should think of some scheme that nicely combines the two. Kind of like the "copy break" that
the latest net drivers do these days.
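
A minimal user-space sketch of a copy-break style decision, assuming a fixed
size threshold. The names COPYBREAK, rx_path, rx_pkt and pkt_deliver are
hypothetical, and the value 256 is an arbitrary placeholder, not a measured
one. Small packets are copied into a preallocated buffer; large packets are
passed by reference, which stands in for the page-mapping path.

```c
#include <stddef.h>
#include <string.h>

#define COPYBREAK 256  /* hypothetical threshold in bytes, not a tuned value */

enum rx_path { RX_COPIED, RX_MAPPED };

/* Hypothetical result: where the packet data now lives and how it got there. */
struct rx_pkt {
    const unsigned char *data;
    size_t len;
    enum rx_path path;
};

/*
 * Copy small packets, reference large ones.
 * Below the threshold a plain memcpy() is assumed to be cheaper than the
 * fixed cost of pinning and mapping pages; above it the mapping cost is
 * amortized over more bytes.
 */
static struct rx_pkt pkt_deliver(const unsigned char *src, size_t len,
                                 unsigned char *copy_buf, size_t copy_buf_len)
{
    struct rx_pkt pkt = { .data = src, .len = len, .path = RX_MAPPED };

    if (len <= COPYBREAK && len <= copy_buf_len) {
        memcpy(copy_buf, src, len);
        pkt.data = copy_buf;
        pkt.path = RX_COPIED;
    }
    return pkt;
}
```

The right threshold depends on the hardware and on the real cost of
get_user_pages(), so it would have to be found by profiling.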

Also btw why call it VIRTIO? For example, I'm actually interested in speeding up tunneling
and general network apps. We have wireless basestation apps here that need to handle packets
in user-space. Those kinds of things have nothing to do with virtualization.

Max