[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <48E4832A.3070804@cosmosbay.com>
Date: Thu, 02 Oct 2008 10:15:38 +0200
From: Eric Dumazet <dada1@...mosbay.com>
To: Bill Fink <billfink@...dspring.com>
Cc: Neil Horman <nhorman@...driver.com>,
David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
kuznet@....inr.ac.ru, pekkas@...core.fi, jmorris@...ei.org,
yoshfuji@...ux-ipv6.org, kaber@...sh.net,
Evgeniy Polyakov <johnpol@....mipt.ru>
Subject: Re: [PATCH] net: implement emergency route cache rebulds when gc_elasticity
is exceeded
Eric Dumazet a écrit :
> Bill Fink a écrit :
>> On Wed, 1 Oct 2008, Neil Horman wrote:
>>
>>> Hey all-
>>> Since Eric mentioned the use of statistical analysis to determine if
>>> hash chains were growing improperly, I thought I would take a stab at
>>> such an
>>> approach. I'm no statistics expert, but it would seem to me that simply
>>> computing the standard deviation of all the non-zero chain lengths
>>> would give a
>>> good guide point to determine if we needed to invalidate our hash
>>> table. I've
>>> written the below implementation. I've not tested it (I'll be doing
>>> that here
>>> shortly for the next few days), but I wanted to post it to get
>>> feedback from you
>>> all on the subject. The highlights are:
>>>
>>> 1) We add a counter to rt_hash_bucket structure to track the length
>>> of each
>>> individual chain. I realize this is sub-optimal, as it adds
>>> potentially lots of
>>> memory to hash table as a whole (4 bytes * number of hash buckets).
>>> I'm afraid
>>> I've not come up with a better way to track that yet. We also track
>>> the total
>>> number of route entries and the number of non-zero length chains.
>>> Lastly we
>>> also maintain a global maximum chain length which defines the longest
>>> chain we
>>> will tolerate in the route table. This patch defines it as the mean
>>> chain
>>> length plus one standard deviation.
>>
>> I believe the general rule of thumb for something like this is at
>> least two standard deviations. For a normal distribution, one standard
>> deviation covers about 68 % of the sample universe, while two standard
>> deviations covers about 95 % (three standard deviations covers 99.73 %).
>> See the Wikipedia entry:
>>
>> http://en.wikipedia.org/wiki/Standard_deviation
>>
>
> Thanks Bill for the pointer, this is the trick.
>
> I believe we should target "4σ 99.993666% " case.
>
> But we dont need to really compute Standard deviation at runtime, only
> find an (upper) approximation of it.
>
> For elasticity=4 and 512*1024 samples (mean < 4), I guess 4σ can be
> approximated by 20 or something.
>
Good estimation of Standard Deviation could be computed for free
in rt_check_expire(). (each runs scans 20% of hash table with
default tunables timeout & ip_rt_gc_interval)
We could update 4σ estimation in this function, every minute (ip_rt_gc_interval)
At softirq time we then can detect a particular hash chain is
longer than 4σ estimation and trigger an appropriate action.
This action is to : flush table, and while we do that, expand hash table
if its current size is under ip_rt_max_size/elasticity...
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists