netdev - Re: [PATCH net-next-2.6] rps: consistent rxhash

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Wed, 21 Apr 2010 13:06:17 +0200
From:	Franco Fichtner <franco@...tsummer.de>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	Tom Herbert <therbert@...gle.com>, Changli Gao <xiaosuo@...il.com>,
	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH net-next-2.6] rps: consistent rxhash

Eric Dumazet wrote:
> Le mercredi 21 avril 2010 à 11:29 +0200, Franco Fichtner a écrit :
>> Tom Herbert wrote:
>>>> I thought about this for some time...
>>>>
>>>> Do we really need the port numbers here at all? A simple
>>>> addr1^addr2 can provide a good enough pointer for
>>>> distribution amongst CPUs.
>>>>
>>> What about a server behind a TCP proxy?  Also, need to minimize
>>> collisions for RPS to be effective
>> What about routers? What about loopback? This all boils down to
>> the same issue of obscuring IP data by "magical" means and then
>> reattaching functionality by reaching for upper layer information.
>> It is necessary in some cases, but it can cripple performance
>> for other cases.
>>
>> The interesting thing is you don't need to deal with collisions
>> while distributing amonst cpus at all. You just need to make sure
>> the distribution algorithm keeps every single flow attached to
>> the correct cpu.
>>
>> All of the actual flow hashing, tracking and whatever else the
>> traffic needs to go through can be done locally by cpu x which
>> helps a lot with load distribution and cache issues in mind. It
>> also helps locking because there is no global flow lookup table.
>> Oh, and it also reduces collisions with every cpu you add for
>> receiving.
>>
>> I work with a lot of plain office and ISP traffic in mind daily,
>> so please don't misunderstand my motivation here. I'd hate to
>> see poor performance in scenarios in which there is a lot of
>> potential improvement.
>>
> 
> I am a bit lost by this conversation.
> 
> Are you saying something is wrong with current schem ?
> 
> What are exactly your suggestions ?
> 

Hashing for cpu distribution should be as minimal as it could possibly
be with the least number operations needed to compute a hash, which 
normally involves touching one cold cache line (ip header). If you add 
the ports to your mix you have the luxury of solving static ip mappings,
but only for protocols that support it. Usage of the destination port
may also prove to be more or less pointless with a lot of http traffic,
because it's most likely static. And you add another potential cold
cache line access. For a lot of traffic scenarios, we'll have a bunch of
internal ips and the internet on the other side, so having a simple hash
based on a flavor if internal/external ip is more than enough to work
with for distribution. If the network card can provide a complete hash
all the better. Then this part of my point is void.

But then, hashing for cpu distribution should have nothing todo with
real flow tracking with lookup tables for let's say a firewall or dpi
application, because that data is only needed by local cpu and can
be gathered after distribution. Simply put, the lookup for the flow, if
it is needed, does not belong to distribution. It can be outsourced to
the destination cpu or just simply be ignored, if the application
doesn't care.

> Tom replied to you that a hash derived from (addr1 ^ addr2) would not
> work in situations where all flows goes from machine A to machine B
> (all hashes would be the same)
> 

Yes, I see the point. And all I'm just asking if it's wise to optimize
for this particular scenario.

If you spin this idea further beyond flow tracking, maybe an application
also needs to do some kind of user tracking by ip. Wouldn't it make
sense to have user based flows on a more local basis, not a global one
because ports will get in the way?

> Current hash is probably more than enough to cover all situations.
> 

I agree with this, but would like to point out the phrasing "probably
more than enough". :)

Franco
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html