Date:	Tue, 12 Jul 2011 17:54:05 +0200
From:	Michał Mirosław <mirq-linux@...e.qmqm.pl>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH net-next-2.6] net: introduce build_skb()

On Tue, Jul 12, 2011 at 05:40:16PM +0200, Eric Dumazet wrote:
> On Monday, July 11, 2011 at 07:46 +0200, Eric Dumazet wrote:
> 
> > [PATCH] net: introduce build_skb()
> > 
> > One of the things we discussed at the netdev 2011 conference was the
> > idea of changing network drivers to allocate/populate their skb at RX
> > completion time, right before feeding the skb to the network stack.
> > 
> > Right now, we allocate skbs when populating the RX ring, and that's a
> > waste of CPU cache, since allocating an skb means a full memset() to
> > clear the skb and its skb_shared_info portion. By the time the NIC fills
> > a frame into the data buffer and the host can get at it, the CPU has
> > probably evicted those cache lines, because of the huge RX ring sizes.
> > 
> > So the deal would be to allocate only the data buffer for the NIC to
> > populate its RX ring buffer, and then use build_skb() at RX completion
> > to attach the data buffer (now filled with an Ethernet frame) to a new
> > skb, initialize the skb_shared_info portion, and give the hot skb to
> > the network stack.
> 
> Update :
> 
> First results are impressive: about a 15% throughput increase with the
> igb driver on my small desktop machine, and I am limited by the wire
> speed :)
> 
> (AMD Athlon(tm) II X2 B24 Processor, 3GHz, cache size : 1024K)
> 
> Setup: one dual-port Intel card: Ethernet controller: Intel
> Corporation 82576 Gigabit Network Connection (rev 01)
> 
> eth1 direct attach on eth2, Gigabit speed.
> eth2 RX ring set to 4096 slots (default is 256)
> 
> CPU0: pktgen sending on eth1, line rate (1488137 pps)
> CPU1: receives eth2 interrupts; packets are dropped in the raw
> netfilter table to bypass the upper stacks.
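[Editorial note: one way to reproduce a setup like the one described,
using the interface names from this message; adjust for your machine.
These are standard ethtool/iptables invocations, not commands quoted
from the original post.]

```shell
# Grow the RX ring on the receiving port (the driver default is often 256)
ethtool -G eth2 rx 4096

# Drop received packets in the raw table, before routing and sockets,
# so the measurement stops at netfilter and bypasses the upper stacks
iptables -t raw -A PREROUTING -i eth2 -j DROP
```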
> 
> Before patch: 15% packet losses, ksoftirqd/1 using 100% of CPU.
> After patch: residual losses (less than 0.1%), ksoftirqd not used,
> 80% CPU used.
> 
> I'll do more tests with a 10Gb card (ixgbe driver) so as not to be
> wire-limited.

I remember observing a similar increase after switching from allocating
skbs to allocating pages and using napi_get_frags() + napi_gro_frags().
That was with the sl351x driver posted for review some time ago.

Best Regards,
Michał Mirosław
