Message-ID: <CAA93jw6WWXJHz1yAffuUWZVyGBUCuRmi5gY6QGdz16FAFvR+Kw@mail.gmail.com>
Date: Sun, 1 Mar 2015 12:09:30 -0800
From: Dave Taht <dave.taht@...il.com>
To: Tom Herbert <therbert@...gle.com>
Cc: Florian Westphal <fw@...len.de>,
Eric Dumazet <eric.dumazet@...il.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 3/6] flow_dissector: Add hash_extra field to
flow_keys struct
On Sun, Mar 1, 2015 at 10:16 AM, Tom Herbert <therbert@...gle.com> wrote:
> On Sat, Feb 28, 2015 at 12:46 PM, Dave Taht <dave.taht@...il.com> wrote:
>> On Sat, Feb 28, 2015 at 12:31 PM, Florian Westphal <fw@...len.de> wrote:
>>> Eric Dumazet <eric.dumazet@...il.com> wrote:
>>>> On Fri, 2015-02-27 at 19:11 -0800, Tom Herbert wrote:
>>>>
>>>> > diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>>>> > index c605d30..d41a034 100644
>>>> > --- a/include/net/sch_generic.h
>>>> > +++ b/include/net/sch_generic.h
>>>> > @@ -252,7 +252,7 @@ struct qdisc_skb_cb {
>>>> > unsigned int pkt_len;
>>>> > u16 slave_dev_queue_mapping;
>>>> > u16 _pad;
>>>> > -#define QDISC_CB_PRIV_LEN 20
>>>> > +#define QDISC_CB_PRIV_LEN 24
>>>> > unsigned char data[QDISC_CB_PRIV_LEN];
>>>> > };
>>>> >
>>>>
>>>> This change breaks kernel build : We already are at the cb[] limit.
>>>>
>>>> Please check commit 257117862634d89de33fec74858b1a0ba5ab444b
>>>> ("net: sched: shrink struct qdisc_skb_cb to 28 bytes")
>>>
>>> I've been toying around with reducing skb->cb[] to 44 bytes,
>>> Seems Tom could integrate following patch from my test branch:
>>>
>>> http://git.breakpoint.cc/cgit/fw/net-next.git/commit/?h=skb_cb_44_01&id=29d711e1a71244b71940c2d1e346500bef4d6670
>>>
>>> It makes sfq use a smaller flow key state.
>>
>> My concern with all this work is that you may not be looking at the
>> quality of the hash as the number of queues goes down, or at the
>> effect of mixing all this extra stuff into the hash in cases where
>> those fields don't exist or are not very random.
>>
>> The default in fq_codel is 1024 queues, and that worked pretty well
>> in Monte Carlo simulations, but I have always felt it could be better
>> once we measured more real traffic - there is not a lot of
>> information in the proto field in real traffic, and - although it has
>> been improved - the ipv6 hash was kind of weak originally and is a
>> little odd now.
>>
>> As some are attempting to deploy these hashes with 64, 32 and even 8
>> queues, I would hope that someone (and I can if I get the time) would
>> look closely at avalanche effects down to these last few bits.
>>
>> http://en.wikipedia.org/wiki/Avalanche_effect
>>
> We are only increasing the input to the hash function by XOR, not
> reducing it, so it seems unlikely this could result in less entropy.
> In the worst case the extra input might have no effect. As for the
> avalanche effect, that is more dependent on the hash function itself.
> In the kernel we are using the Jenkins hash for such things, and
> there's a nice graphical representation of the avalanche effect on
> the Wikipedia page:
>
> http://en.wikipedia.org/wiki/Jenkins_hash_function
I did not say you were wrong! I just said you were making me nervous. :)
Hash functions are usually evaluated by tossing random data into them
and expecting random data all the way to the least significant bit.
In the networking case there is now a significant amount of data with
low entropy tossed into the function. I would be happier from a
theoretical perspective if you just tossed every input (like the full
ipv6 addresses) into the hash itself with no xor tricks and with some
care as to what low entropy sources were being used on the low order
bits.
As an example, you get 2 bits of data from the remote port truly mixed
in at 1024 queues, and yes, Jenkins should avalanche that, but I
really would prefer it be evaluated on various forms of real traffic,
not random data.
And there might be other hash functions besides Jenkins that are
better or faster now. Although I have an interest in such things, I
generally lack the time to play with way cool new stuff like
http://www.burtleburtle.net/bob/hash/spooky.html
and:
https://code.google.com/p/smhasher/wiki/MurmurHash
Certainly it doesn't generally matter which hash is used, so long as
it is correctly responsive to its inputs, and fast.
--
Dave Täht
Let's make wifi fast, less jittery and reliable again!
https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html