Message-ID: <1272388439.2295.369.camel@edumazet-laptop>
Date: Tue, 27 Apr 2010 19:13:59 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>
Cc: bmb@...enacr.com, therbert@...gle.com, netdev@...r.kernel.org,
rick.jones2@...com
Subject: Re: [PATCH] bnx2x: add support for receive hashing
On Tuesday, 27 April 2010 at 09:51 -0700, David Miller wrote:
> From: Brian Bloniarz <bmb@...enacr.com>
> Date: Tue, 27 Apr 2010 09:37:11 -0400
>
> > David Miller wrote:
> >> How damn hard is it to add two 16-bit ports to the hash regardless of
> >> protocol?
> >>
> > Come to think of it, for UDP the hash must ignore
> > the srcport and srcaddr, because a single bound
> > socket is going to wildcard both those fields.
>
> For load distribution we don't care if the local socket is wildcard
> bound on source.
>
> It's going to be fully specified in the packet, and that's enough.
>
> Sure, for full RFS some amendments might be necessary in this area, but
> for RPS and adapter based hw steering, using all of the ports is
> entirely sufficient.
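To make that point concrete: hashing the full 4-tuple, both 16-bit ports included, does spread distinct remote flows over cpus. Below is a toy userspace sketch; the mixing function, seed and 8-cpu modulo are invented for the example, and are not the jhash/Toeplitz that RPS or NIC RSS really use.

/* Illustrative sketch only: fold both 16-bit ports into the flow hash,
 * whatever the protocol.  The mixing function is a stand-in, not the
 * Toeplitz/jhash used by real hardware or by RPS. */
#include <stdint.h>
#include <stdio.h>

static uint32_t mix32(uint32_t h, uint32_t v)
{
	h ^= v;
	h *= 0x9e3779b1;		/* arbitrary odd multiplier */
	return h ^ (h >> 16);
}

static uint32_t flow_hash(uint32_t saddr, uint32_t daddr,
			  uint16_t sport, uint16_t dport)
{
	uint32_t h = mix32(0x12345678, saddr);	/* arbitrary seed */
	h = mix32(h, daddr);
	return mix32(h, ((uint32_t)sport << 16) | dport);
}

int main(void)
{
	/* 100 distinct sources, one destination port: the hashes differ,
	 * so the packets spread over (here) 8 hypothetical cpus. */
	for (uint32_t i = 10; i < 110; i++) {
		uint32_t h = flow_hash(0xc0a80000 | i,	/* 192.168.0.i */
				       0xc0a80002,	/* 192.168.0.2 */
				       9, 4000);	/* src port arbitrary */
		printf("192.168.0.%u -> cpu %u\n", i, h & 7);
	}
	return 0;
}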
Well well well...
I was doing some pktgen tests, with:
pgset "src_min 192.168.0.10"
pgset "src_max 192.168.0.110"
pgset "dst_min 192.168.0.2"
pgset "dst_max 192.168.0.2"
pgset "udp_dst_min 4000"
pgset "udp_dst_max 4000"
So I simulate 100 remote IPs bombarding a single port on the target machine.
pktgen injects about 930,000 pps.
The softirq of my target runs on cpu0, and RPS spreads packets to 7 other
cpus.
And my receiver is stuck (it can read about 50 pps!)
As soon as I disable RPS, my receiver can catch 850,000 pps.
RPS OFF: perf top of cpu 0
------------------------------------------------------------------------------------------------------------------------------
PerfTop: 1001 irqs/sec kernel:100.0% [1000Hz cycles], (all, cpu: 0)
------------------------------------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ______________________ _______
385.00 10.2% __udp4_lib_lookup vmlinux
322.00 8.5% ip_route_input vmlinux
312.00 8.3% sock_queue_rcv_skb vmlinux
262.00 6.9% do_raw_spin_lock vmlinux
251.00 6.6% __alloc_skb vmlinux
239.00 6.3% sock_put vmlinux
207.00 5.5% eth_type_trans vmlinux
202.00 5.4% __slab_alloc vmlinux
159.00 4.2% __kmalloc_track_caller vmlinux
149.00 3.9% __sk_mem_schedule vmlinux
125.00 3.3% kmem_cache_alloc vmlinux
116.00 3.1% ipt_do_table vmlinux
115.00 3.0% do_raw_read_lock vmlinux
71.00 1.9% tg3_poll_work vmlinux
65.00 1.7% __netdev_alloc_skb vmlinux
64.00 1.7% skb_pull vmlinux
58.00 1.5% ip_rcv vmlinux
58.00 1.5% __slab_free vmlinux
53.00 1.4% udp_queue_rcv_skb vmlinux
47.00 1.2% nf_iterate vmlinux
44.00 1.2% __netif_receive_skb vmlinux
29.00 0.8% sock_def_readable vmlinux
28.00 0.7% do_raw_spin_unlock vmlinux
26.00 0.7% kfree vmlinux
25.00 0.7% __udp4_lib_rcv vmlinux
24.00 0.6% ip_rcv_finish vmlinux
24.00 0.6% __list_add vmlinux
RPS ON: perf top of a slave cpu
------------------------------------------------------------------------------------------------------------------------------
PerfTop: 1000 irqs/sec kernel:100.0% [1000Hz cycles], (all, cpu: 1)
------------------------------------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________ _______
2411.00 62.0% do_raw_spin_lock vmlinux
690.00 17.7% delay_tsc vmlinux
234.00 6.0% __udp4_lib_lookup vmlinux
174.00 4.5% sock_put vmlinux
72.00 1.9% ip_rcv vmlinux
51.00 1.3% __netif_receive_skb vmlinux
43.00 1.1% do_raw_spin_unlock vmlinux
39.00 1.0% __delay vmlinux
38.00 1.0% sock_queue_rcv_skb vmlinux
36.00 0.9% udp_queue_rcv_skb vmlinux
31.00 0.8% ip_route_input vmlinux
15.00 0.4% __slab_free vmlinux
12.00 0.3% ipt_do_table vmlinux
11.00 0.3% skb_release_data vmlinux
7.00 0.2% kfree vmlinux
5.00 0.1% nf_iterate vmlinux
So we have a BIG problem:
All cpus are fighting to get the socket lock, and very little progress is
made.
Note this problem has nothing to do with RPS per se; we could hit it with
multiqueue as well.
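The effect is the generic one: whatever the steering gains by spreading work over cpus, one shared per-socket lock serializes it again. A toy userspace sketch of that pattern follows (pure illustration, nothing to do with the actual kernel code; thread and iteration counts are arbitrary).

/* Toy illustration, not kernel code: 8 "cpus" that must all take one
 * shared lock to deliver make little progress compared to 8 cpus with
 * independent locks.  Build with: gcc -O2 -pthread contention.c */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define NTHREADS   8
#define PER_THREAD 2000000

static pthread_spinlock_t shared_lock;			/* stands in for the socket lock */
static pthread_spinlock_t private_lock[NTHREADS];

struct arg { int id; int use_shared; };

static void *worker(void *p)
{
	struct arg *a = p;
	pthread_spinlock_t *lock = a->use_shared ?
		&shared_lock : &private_lock[a->id];

	for (int i = 0; i < PER_THREAD; i++) {
		pthread_spin_lock(lock);	/* "deliver one packet" */
		pthread_spin_unlock(lock);
	}
	return NULL;
}

static double run(int use_shared)
{
	pthread_t tid[NTHREADS];
	struct arg args[NTHREADS];
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < NTHREADS; i++) {
		args[i] = (struct arg){ .id = i, .use_shared = use_shared };
		pthread_create(&tid[i], NULL, worker, &args[i]);
	}
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
	pthread_spin_init(&shared_lock, PTHREAD_PROCESS_PRIVATE);
	for (int i = 0; i < NTHREADS; i++)
		pthread_spin_init(&private_lock[i], PTHREAD_PROCESS_PRIVATE);

	printf("one shared lock : %.2f s\n", run(1));	/* all threads contend */
	printf("per-thread locks: %.2f s\n", run(0));	/* no contention */
	return 0;
}

On a multi-core box the shared-lock run is expected to take several times longer; that serialization is what the slave-cpu profile above shows as time spent in do_raw_spin_lock.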
Oh well...