netdev - Re: [PATCH] net: implement emergency route cache rebulds when gc

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20081007105459.GA1455@hmsreliant.think-freely.org>
Date:	Tue, 7 Oct 2008 06:54:59 -0400
From:	Neil Horman <nhorman@...driver.com>
To:	Eric Dumazet <dada1@...mosbay.com>
Cc:	Bill Fink <billfink@...dspring.com>,
	David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
	kuznet@....inr.ac.ru, pekkas@...core.fi, jmorris@...ei.org,
	yoshfuji@...ux-ipv6.org, kaber@...sh.net,
	Evgeniy Polyakov <johnpol@....mipt.ru>
Subject: Re: [PATCH] net: implement emergency route cache rebulds when
	gc_elasticity is exceeded

On Tue, Oct 07, 2008 at 07:13:29AM +0200, Eric Dumazet wrote:
> Neil Horman a écrit :
>> On Mon, Oct 06, 2008 at 11:21:38PM +0200, Eric Dumazet wrote:
>>> Neil Horman a écrit :
>>>> So, I've been playing with this patch, and I've not figured out eactly whats
>>>> bothering me yet, since the math seems right, but something doesn't seem right
>>>> about the outcome of this algorithm.  I've tested with my local system, and all
>>>> works well, because the route cache is well behaved, and the sd value always
>>>> works out to be very small, so ip_rt_gc_elasticity is used.  So I've been
>>>> working through some scenarios by hand to see what this looks like using larger
>>>> numbers.  If i assume ip_rt_gc_interval is 60, and rt_hash_log is 17, my sample
>>>> count here is 7864320 samples per run.  If within that sample 393216 (about 4%)
>>>> of the buckets have one entry on the chain, and all the rest are zeros, my hand
>>>> calculations result in a standard deviation of approximately 140 and an average
>>>> of .4.  That imples that in that sample set any one chain could be almost 500
>>>> entires long before it triggered a cache rebuld.  Does that seem reasonable?
>>>>
>>> if rt_hash_log is 17, and interval is 60, then you should scan (60 << 
>>> 17)/300 slots. That's 26214 slots. (ie 20% of the 2^17 slots)
>>>
>>> I have no idea how you can have sd = 140, even if scaled by (1 << 3)
>>> with slots being empty or with one entry only...
>>>
>> I don't either, that was my concern :).
>>
>>> If 4% of your slots have one element, then average length is 0.04 :)
>>>
>> Yes, and the average worked out properly, which is why I was concerned.
>>
>> If you take an even simpler case, like you state above (I admit I miseed the
>> /300 part of the sample, but no matter).
>>
>> samples = 26214
>> Assume each sample has a chain length of 1
>>
>> sum = 26214 * (1 << 3) = 209712
>> sum2 = sum * sum = s09712 * 209712 = 43979122944
>> avg = sum / samples = 209712 / 26214 = 8 (correct)
>> sd = sqrt(sum2 / samples - avg*avg) = sqrt(43979122944/26214 - 64) = 1295
>> sd >> 3 = 1295.23 >> 3 = 161
>>
>>
>> Clearly, given the assumption that every chain in the sample set has 1 entry,
>> giving us an average of one, the traditional method of computing standard
>> deviation should have yielded an sd of 0 exactly, since every sample was
>> precisely the average. However, the math above gives us something significantly
>> larger.  I'm hoping I missed something, but I don't think I have.  
>>
>
> Famous last sentence ;)
>
> You made some errors in your hand calculations.
> sum2 is the sum of squares. Its not sum * sum.
>
> If all slots have one entry, all "lengths" are (1<<3), and their 'square 
> scaled by 6' is (1 << 6) . So sum2 = 26214 * (1 << 6) = 1677696
> avg = sum / samples = 209712 / 26214 = (1 << 3)
> sd = sqrt(sum2 / samples - avg*avg) = sqrt(64 - 64) = 0 (this is what we expected)
>
> Now if you have 4 % of slots with one entry, and 96 % that are empty,
> you should have /* real maths */
> avg = 0.04
> sd = sqrt(0.04 - 0.04*0.04) = sqrt(0.0384) = 0.195959
> avg + 4*sd = 0.82
>
> /* fixed point math */
> sum = 0.04 * 26214 * (1<<3) = 1048 * (1<<3) = 8384 sum2 = 1048 * (1 << 6) 
> = 67072
> avg << 3 = 8384/26214 = 0   (with 3 bits for fractional part, we do have rounding error)
> sd << 3 = sqrt(67072/26214 - 0) = 1
> (avg + 4*sd) << 3 = 4  -> final result is 4>>3 = 0 (expected)
>
> Now if 50% of slots have one entry, we get :
> /* real maths */
> avg = 0.5
> sd = sqrt(0.5 - 0.5*0.5) = sqrt(0.25) = 0.5
> avg + 4*sd = 2.5
>
> /* fixed point math */
> sum = 0.5 * 26214 * (1<<3) = 104856
> sum2 = 13107 * (1<<6) = 838848
> avg << 3 = 104856/26214 = 4
> sd << 3 = sqrt(838848/26214 - 4*4) = sqrt(32 - 16) = 4 (avg + 4*sd) << 3 
> = 20  -> final result is 20>>3 = 2 (expected)
>
>
> Hope this helps
>


Gaaa! Yes, embarrasing as it may be, that clarifies it for me.  Sorry, I missed
the the fact that sum2 is the sum of squares rather than the sqare of the sum.
Stupid of me.  Ok, so this actually works rather well then. I'll keep testing.

Thanks!
Neil

>
>
>

-- 
/****************************************************
 * Neil Horman <nhorman@...driver.com>
 * Software Engineer, Red Hat
 ****************************************************/
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html