Message-ID: <alpine.DEB.2.11.1606161700150.5839@nanos>
Date:	Thu, 16 Jun 2016 17:43:36 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Arjan van de Ven <arjanvandeven@...il.com>
cc:	Eric Dumazet <edumazet@...gle.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Chris Mason <clm@...com>,
	Arjan van de Ven <arjan@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	George Spelvin <linux@...encehorizons.net>
Subject: Re: [patch 13/20] timer: Switch to a non cascading wheel

On Wed, 15 Jun 2016, Thomas Gleixner wrote:
> On Wed, 15 Jun 2016, Arjan van de Ven wrote:
> > what would 1 more timer wheel do?
> 
> Waste storage space and make the collection of expired timers more expensive.
> 
> The selection of the timer wheel properties is combination of:
> 
>     1) Granularity 
> 
>     2) Storage space
> 
>     3) Number of levels to collect

So I came up with a slightly different solution for this. The problem case is
HZ=1000, and looking at the data again, there is no reason why we need actual
1ms granularity for timer wheel timers. That's independent of the desired
ms-based interfaces.

We can simply run the wheel internally with 4ms base level resolution and
degrade from there. That gives us 6+ days of range and a simple cutoff at the
capacity of the level 7 wheel.

LVL OFFS  GRANULARITY         RANGE
 0     0        4 ms               0 ms -        255 ms
 1    64       32 ms             256 ms -       2047 ms (256ms - ~2s)
 2   128      256 ms            2048 ms -      16383 ms (~2s - ~16s)
 3   192     2048 ms (~2s)     16384 ms -     131071 ms (~16s - ~2m)
 4   256    16384 ms (~16s)   131072 ms -    1048575 ms (~2m - ~17m)
 5   320   131072 ms (~2m)   1048576 ms -    8388607 ms (~17m - ~2h)
 6   384  1048576 ms (~17m)  8388608 ms -   67108863 ms (~2h - ~18h)
 7   448  8388608 ms (~2h)  67108864 ms -  536870911 ms (~18h - ~6d)
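
For illustration, here is a minimal userspace sketch of the level selection
math implied by the table above. This is my illustration only, not code from
the series; the constant names (BASE_RES_MS, LVL_CLK_SHIFT, ...) are made up:

#include <stdio.h>

#define BASE_RES_MS    4UL	/* level 0 resolution		*/
#define LVL_CLK_SHIFT  3	/* each level is 8x coarser	*/
#define LVL_SIZE       64UL	/* buckets per level		*/
#define NUM_LVLS       8

int main(void)
{
	unsigned long deltas_ms[] = { 100, 3000, 60000, 3600000 };

	for (int i = 0; i < 4; i++) {
		unsigned long ticks = deltas_ms[i] / BASE_RES_MS;
		int lvl = 0;

		/* Level n covers deltas below LVL_SIZE << (n * LVL_CLK_SHIFT) ticks */
		while (lvl < NUM_LVLS - 1 &&
		       ticks >= (LVL_SIZE << (lvl * LVL_CLK_SHIFT)))
			lvl++;

		printf("%8lu ms -> level %d, granularity %lu ms\n",
		       deltas_ms[i], lvl,
		       BASE_RES_MS << (lvl * LVL_CLK_SHIFT));
	}
	return 0;
}

E.g. a 60s timeout lands in level 3 and gets rounded to 2048ms granularity,
matching the table row above.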

That works really nicely and has the interesting side effect that we batch in
the first level wheel, which helps networking. I'll repost the series later
tonight with the other review points addressed.

Btw, I also thought a bit more about the millisecond interfaces. I think we
shouldn't invent new interfaces. The correct solution IMHO is to disentangle
the scheduler tick frequency and jiffies. If we have them completely
separated, then we can do the following:

1) Force HZ=1000. That means jiffies and timer wheel units are 1ms. If the
   tick frequency is != 1000 we simply increment jiffies in the tick by the
   proper amount (4 @250 ticks/sec, 10 @100 ticks/sec); see the sketch after
   this list.

   So all msecs_to_jiffies() invocations magically compile out to nothing and
   we can remove them gradually over time.

2) When we do that right, we can make the tick frequency a command-line
   option and just have a compiled-in default.
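
As a minimal sketch of what 1) would look like (again just an illustration in
userspace; tick_freq and do_tick() are made-up names, not the actual
implementation):

#include <stdio.h>

#define HZ 1000				/* jiffies are always 1 ms */

static unsigned long jiffies;
static unsigned int tick_freq = 250;	/* ticks per second, could come
					   from the command line */

/* Called once per hardware tick. */
static void do_tick(void)
{
	jiffies += HZ / tick_freq;	/* 4 @250 ticks/sec, 10 @100 ticks/sec */
}

/* With HZ fixed at 1000 this is the identity and compiles away. */
static inline unsigned long msecs_to_jiffies(unsigned int ms)
{
	return ms;
}

int main(void)
{
	for (unsigned int i = 0; i < tick_freq; i++)	/* simulate one second */
		do_tick();
	printf("jiffies after 1s: %lu, msecs_to_jiffies(100) = %lu\n",
	       jiffies, msecs_to_jiffies(100));
	return 0;
}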

Thoughts?

Thanks,

	tglx
