Date: Wed, 11 Nov 2009 08:28:07 -0800
From: Tom Herbert <therbert@...gle.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH 1/2] rps: core implementation
> I must say this is really exciting :)
>
Thanks!
>> +/* Maximum size of RPS map (for allocation) */
>> +#define RPS_MAP_SIZE (sizeof(struct rps_map) + \
>> + (num_possible_cpus() * sizeof(u16)))
>> +
>
> Problem of possible cpus is the number can be very large on some arches,
> but yet few cpus online....
>
> In this kind of situation, get_rps_cpu() will return -1 most of the time,
> defeating goal of RPS ?
>
I suppose it would make sense to either use num_online_cpus() or simply
put a reasonable limit on it (HW RSS hash tables have 128 entries,
I believe).
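To make the sizing idea concrete, here is a minimal userspace sketch, not the patch itself. It assumes the map is filled only with online CPU ids and capped at 128 entries, so a lookup cannot land on an unpopulated slot. The names rps_map_sketch, rps_map_build, rps_map_pick and RPS_MAP_MAX_ENTRIES are hypothetical and only for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed cap, mirroring the 128-entry HW RSS table size mentioned above. */
#define RPS_MAP_MAX_ENTRIES 128

/* Hypothetical map: a length plus a flexible array of CPU ids. */
struct rps_map_sketch {
	unsigned int len;
	uint16_t cpus[];
};

/* Build a map holding only the online CPU ids, truncated to the cap. */
static struct rps_map_sketch *rps_map_build(const uint16_t *online,
					    unsigned int n_online)
{
	unsigned int len = n_online < RPS_MAP_MAX_ENTRIES ?
			   n_online : RPS_MAP_MAX_ENTRIES;
	struct rps_map_sketch *m;
	unsigned int i;

	if (len == 0)
		return NULL;

	m = malloc(sizeof(*m) + len * sizeof(m->cpus[0]));
	if (!m)
		return NULL;

	m->len = len;
	for (i = 0; i < len; i++)
		m->cpus[i] = online[i];
	return m;
}

/* Pick a CPU for a flow hash; returns -1 only when there is no usable map. */
static int rps_map_pick(const struct rps_map_sketch *m, uint32_t hash)
{
	if (!m || m->len == 0)
		return -1;
	return m->cpus[hash % m->len];
}
```

Because every slot holds an online CPU, the -1 case is limited to an empty map. The cost is that the map has to be rebuilt on CPU hotplug, which this sketch does not handle.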
>> + hash = jhash_3words(addr1, addr2, ports, simple_hashrnd);
>
> I wonder if you tried to exchange addr1/addr2 port1/port2 so that conntracking/routing
> is also speedup ...
>
> ie make sure hash will be the same regardless of the direction of packet.
>
> union {
> u32 port;
> u16 ports[2];
> } p;
>
> if (addr1 < addr2)
> swap(addr1, addr2);
>
> if (p.ports[0] < p.ports[1])
> swap(p.ports[0], p.ports[1]);
>
I have not considered that. How much of a win would this be?
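For reference, the direction-independent hash suggested above could look roughly like the following userspace sketch. mix3() is a hypothetical stand-in for jhash_3words() and symmetric_flow_hash() is an illustrative name; neither is taken from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's jhash_3words(); any 3-word mixer
 * would do for this illustration. */
static uint32_t mix3(uint32_t a, uint32_t b, uint32_t c, uint32_t seed)
{
	uint32_t h = seed;

	h = (h ^ a) * 0x9e3779b1u;
	h = (h ^ b) * 0x85ebca6bu;
	h = (h ^ c) * 0xc2b2ae35u;
	return h ^ (h >> 16);
}

/* Order the address pair and the port pair before hashing, so both
 * directions of a flow produce the same value. */
static uint32_t symmetric_flow_hash(uint32_t saddr, uint32_t daddr,
				    uint16_t sport, uint16_t dport,
				    uint32_t seed)
{
	uint32_t ports;

	if (saddr < daddr) {
		uint32_t t = saddr;
		saddr = daddr;
		daddr = t;
	}
	if (sport < dport) {
		uint16_t t = sport;
		sport = dport;
		dport = t;
	}

	/* Pack the ordered ports into one 32-bit word for the mixer. */
	ports = ((uint32_t)sport << 16) | dport;
	return mix3(saddr, daddr, ports, seed);
}
```

One consequence of ordering addresses and ports independently is that A:p1 -> B:p2 and A:p2 -> B:p1 also hash to the same value. That is a few extra collisions, not a correctness problem.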
> hash = jhash_3words(addr1, addr2, ports, simple_hashrnd);
>
Another possibility we considered was to call inet_hashfn and
inet6_ehashfn directly to get the hash, store that value in
skb->rxhash, and use it later for the connection lookup in tcp_v4_rcv
to eliminate another jhash. This has some benefit, but it doesn't
help if we get a different type of hash from HW (using that is a much
bigger win), and we also needed to pull more IP header files into
dev.c.
>
> I think I'll try to extend your patches with TX completion recycling too.
>
> Ie record in skb the cpu number of original sender, and queue skb to
> remote queue for destruction (sock_wfree() call and expensive scheduler calls...)
>
We have also implemented a form of that, if you are interested. In
dev_kfree_skb, we put the skb on the completion list of its origin CPU
(where it was allocated) and use the remote softirq to schedule
processing.
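As a rough illustration of the idea only, here is a single-threaded userspace toy, not our kernel code. It records the origin CPU in each buffer and defers the free to that CPU's completion list. buf_sketch, buf_free_deferred, completion_drain and NR_CPUS_SKETCH are hypothetical names, and a real implementation would need per-CPU locking or lockless lists plus a remote softirq to trigger the drain.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* assumed CPU count for the toy */

/* Hypothetical buffer that remembers which CPU allocated it. */
struct buf_sketch {
	int origin_cpu;
	struct buf_sketch *next;
};

/* One completion list per CPU (no locking: single-threaded toy). */
static struct buf_sketch *completion_list[NR_CPUS_SKETCH];

/* Instead of freeing locally, queue the buffer on its origin CPU's list.
 * Returns 0 on success, -1 if the recorded CPU id is out of range. */
static int buf_free_deferred(struct buf_sketch *b)
{
	if (b->origin_cpu < 0 || b->origin_cpu >= NR_CPUS_SKETCH)
		return -1;

	b->next = completion_list[b->origin_cpu];
	completion_list[b->origin_cpu] = b;
	return 0;
}

/* Run on the origin CPU: detach its list and count the buffers released. */
static unsigned int completion_drain(int cpu)
{
	struct buf_sketch *b;
	unsigned int n = 0;

	if (cpu < 0 || cpu >= NR_CPUS_SKETCH)
		return 0;

	b = completion_list[cpu];
	completion_list[cpu] = NULL;
	while (b) {
		struct buf_sketch *next = b->next;

		b->next = NULL;	/* the real destructor work would go here */
		b = next;
		n++;
	}
	return n;
}
```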
Tom