Message-ID: <4A26E4FD.5010405@gmail.com>
Date: Wed, 03 Jun 2009 23:02:53 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Rusty Russell <rusty@...tcorp.com.au>
CC: netdev@...r.kernel.org, virtualization@...ts.linux-foundation.org,
Divy Le Ray <divy@...lsio.com>,
Roland Dreier <rolandd@...co.com>,
Pavel Emelianov <xemul@...nvz.org>,
Dan Williams <dcbw@...hat.com>,
libertas-dev@...ts.infradead.org
Subject: Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit
Rusty Russell wrote:
> On Sat, 30 May 2009 12:41:00 am Eric Dumazet wrote:
>> Rusty Russell wrote:
>>> DaveM points out that there are advantages to doing it generally (it's
>>> more likely to be on same CPU than after xmit), and I couldn't find
>>> any new starvation issues in simple benchmarking here.
>> If really no starvation is possible at all, I really wonder why some
>> guys added memory accounting to UDP flows. Maybe they don't run "simple
>> benchmarks" but real apps ? :)
>
> Well, without any accounting at all you could use quite a lot of memory as
> there are many places packets can be queued.
>
>> For TCP, I agree your patch is a huge benefit, since its paced by remote
>> ACKS and window control
>
> I doubt that. There'll be some cache friendliness, but I'm not sure it'll be
> measurable, let alone "huge". It's the win to drivers which don't have a
> timely and batching tx free mechanism which I aim for.
At 250,000 packets/second on a Gigabit link, this is huge, I can tell you.
(250,000 incoming packets and 250,000 outgoing packets per second, 700 Mbit/s)

In the oprofile below, taken on CPU0 (dedicated to softirqs for one bnx2 eth adapter),
sock_wfree() is number 2 on the profile, because it touches three cache lines per socket and
transmitted packet in the TX completion handler.

Also, taking a reference on the socket for each xmit packet in flight is very expensive, since it
slows down the receiver in __udp4_lib_lookup(): several cpus are fighting for the sk->refcnt
cache line.
CPU: Core 2, speed 3000.24 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples cum. samples % cum. % symbol name
21215 21215 11.8847 11.8847 bnx2_poll_work
17239 38454 9.6573 21.5420 sock_wfree << effect of udp memory accounting >>
14817 53271 8.3005 29.8425 __slab_free
14635 67906 8.1986 38.0411 __udp4_lib_lookup
11425 79331 6.4003 44.4414 __alloc_skb
9710 89041 5.4396 49.8810 __slab_alloc
8095 97136 4.5348 54.4158 __udp4_lib_rcv
7831 104967 4.3869 58.8027 sock_def_write_space
7586 112553 4.2497 63.0524 ip_rcv
7518 120071 4.2116 67.2640 skb_dma_unmap
6711 126782 3.7595 71.0235 netif_receive_skb
6272 133054 3.5136 74.5371 udp_queue_rcv_skb
5262 138316 2.9478 77.4849 skb_release_data
5023 143339 2.8139 80.2988 __kmalloc_track_caller
4070 147409 2.2800 82.5788 kmem_cache_alloc
3216 150625 1.8016 84.3804 ipt_do_table
2576 153201 1.4431 85.8235 skb_queue_tail