Message-ID: <1452795849.1223.112.camel@edumazet-glaptop2.roam.corp.google.com>
Date: Thu, 14 Jan 2016 10:24:09 -0800
From: Eric Dumazet <eric.dumazet@...il.com>
To: One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>
Cc: Tom Herbert <tom@...bertland.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
David Miller <davem@...emloft.net>,
"vkuznets@...hat.com" <vkuznets@...hat.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
KY Srinivasan <kys@...rosoft.com>,
"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct
flow_keys layout
On Thu, 2016-01-14 at 17:53 +0000, One Thousand Gnomes wrote:
> > These results for Toeplitz are not plausible. Given random input you
> > cannot expect any hash function to produce such uniform results. I
> > suspect either your input data is biased or how you're applying the hash
> > is.
> >
> > When I run 64 random IPv4 3-tuples through Toeplitz and Jenkins I get
> > something more reasonable:
>
> IPv4 address patterns are not random. Nothing like it. A long long time
> ago we did do a bunch of tuning for network hashes using big porn site
> data sets. Random it was not.
>
I ran my tests with non-random IPv4 addresses, as I had 2 hosts:
one server, one client (typical benchmark stuff).
The only 'random' part was the ports, so maybe ~20 bits of entropy,
considering how we allocate ports during connect() to a given
destination to avoid port reuse.
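To make the limited-entropy point concrete, here is a minimal user-space
sketch of the standard Toeplitz/RSS hash (not the hv_netvsc code); the key
bytes, addresses and ports are placeholders, and only the source port
varies, mimicking a two-host client/server benchmark like the one above:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Standard Toeplitz/RSS hash: for every set bit of the input, XOR in the
 * 32-bit window of the key starting at that bit position. The key must be
 * at least len + 4 bytes long. */
static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *data, size_t len)
{
	uint32_t hash = 0;
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8) | key[3];
	size_t i;
	int b;

	for (i = 0; i < len; i++) {
		for (b = 0; b < 8; b++) {
			if (data[i] & (0x80u >> b))
				hash ^= window;
			/* Slide the key window left by one bit. */
			window <<= 1;
			if (key[i + 4] & (0x80u >> b))
				window |= 1;
		}
	}
	return hash;
}

int main(void)
{
	/* Placeholder 40-byte key, NOT a recommended RSS key. */
	uint8_t key[40];
	/* src IP, dst IP, src port, dst port: everything fixed except the
	 * source port, as in the two-host benchmark described above. */
	uint8_t tuple[12] = { 10, 0, 0, 1,  10, 0, 0, 2,  0, 0,  0, 80 };
	unsigned int buckets[16] = { 0 };
	unsigned int i;

	for (i = 0; i < sizeof(key); i++)
		key[i] = (uint8_t)(i * 37 + 11);	/* arbitrary pattern */

	for (i = 0; i < 128; i++) {
		uint16_t sport = (uint16_t)(32768 + i);	/* ephemeral ports */

		tuple[8] = (uint8_t)(sport >> 8);
		tuple[9] = (uint8_t)sport;
		buckets[toeplitz_hash(key, tuple, sizeof(tuple)) & 15]++;
	}

	for (i = 0; i < 16; i++)
		printf("bucket %2u: %u flows\n", i, buckets[i]);
	return 0;
}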
> It's probably hard to repeat that exercise now with geo specific routing,
> and all the front end caches and redirectors on big sites but I'd
> strongly suggest random input is not a good test, and also that you need
> to worry more about hash attacks than perfect distributions.
Anyway, the exercise is not to find a hash that exactly splits 128 flows
into 16 buckets according to the number of flows per bucket.
Maybe only 4 flows are sending at 3 Gbit/s, and the others are sending at
100 kbit/s. There is no way the driver can predict the future.
This is why we prefer to select a queue based on the CPU sending the
packet. This permits a natural shift based on actual load, and is the
default on Linux (see XPS in Documentation/networking/scaling.txt).
Only this driver selects a queue based on a flow 'hash'.
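As a rough user-space illustration of the two policies (not the driver or
kernel code: the scaling mirrors the kernel's reciprocal_scale(), and the
modulo over sched_getcpu() stands in for the per-CPU queue map that XPS
actually configures), the difference boils down to:

#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <stdio.h>

/* Hash-based selection: the same flow always maps to the same queue,
 * no matter how much traffic it carries. */
static unsigned int queue_by_flow_hash(uint32_t flow_hash, unsigned int nqueues)
{
	return (unsigned int)(((uint64_t)flow_hash * nqueues) >> 32);
}

/* CPU-based (XPS-style) selection: follow whichever CPU is transmitting,
 * so the chosen queue shifts naturally with scheduling and actual load. */
static unsigned int queue_by_cpu(unsigned int nqueues)
{
	int cpu = sched_getcpu();

	return (cpu < 0 ? 0u : (unsigned int)cpu) % nqueues;
}

int main(void)
{
	printf("hash-based queue: %u\n", queue_by_flow_hash(0x12345678u, 16));
	printf("cpu-based queue:  %u\n", queue_by_cpu(16));
	return 0;
}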