Message-ID: <87powgav0h.fsf@rasmusvillemoes.dk>
Date: Tue, 02 Feb 2016 00:08:46 +0100
From: Rasmus Villemoes <linux@...musvillemoes.dk>
To: Andi Kleen <ak@...ux.intel.com>
Cc: Andi Kleen <andi@...stfloor.org>, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org, davidlohr.bueso@...com,
rafael.j.wysocki@...el.com, lenb@...nel.org
Subject: Re: [PATCH] Optimize int_sqrt for small values for faster idle

On Mon, Feb 01 2016, Andi Kleen <ak@...ux.intel.com> wrote:
> On Mon, Feb 01, 2016 at 10:25:17PM +0100, Rasmus Villemoes wrote:
>> On Thu, Jan 28 2016, Andi Kleen <andi@...stfloor.org> wrote:
>>
>> > From: Andi Kleen <ak@...ux.intel.com>
>> >
>> > The menu cpuidle governor does at least two int_sqrt() each time
>> > we go into idle in get_typical_interval to compute stddev
>> >
>> > int_sqrts take 100-120 cycles each. Short idle latency is important
>> > for many workloads.
>> >
>>
>> If you want to optimize get_typical_interval(), why not just take the
>> square root out of the equation (literally)?
>>
>> Something like
>
> Looks good. Yes that's a better fix.
>
Thanks. (Is there a good way to tell gcc that avg*avg is actually a
32x32->64 multiplication?)
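One common way to express this in C, as a minimal sketch: narrow both
operands to 32 bits and widen one of them before multiplying. The helper
name square_u32 is hypothetical and not part of any patch in this
thread; it assumes the caller already knows avg fits in 32 bits.
Compilers generally recognize this pattern as a widening multiply, but
the generated code should be checked on the targets that matter.

```c
#include <stdint.h>

/*
 * Hypothetical helper: square a value that is known to fit in 32 bits,
 * keeping the full 64-bit product.
 */
static inline uint64_t square_u32(uint64_t avg)
{
	uint32_t a = (uint32_t)avg;	/* assumes avg <= UINT32_MAX */

	/* One operand is widened, so this is a 32x32->64 multiply. */
	return (uint64_t)a * a;
}
```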

While there and doing the math, I noticed that the variance computation
may _theoretically_ overflow (if half the observations are 0, half C,
the variance before the division should be around INTERVALS*C^2/4, which
is around 2^65 for C=UINT_MAX and INTERVALS=8). I have no idea if it
actually matters, but it can be fixed by lowering the initial threshold
from UINT_MAX to sqrt(4*U64_MAX/INTERVALS) ~~ 3e9. However, this would
make it possible that all observations are larger than the initial
threshold, so we'd have to protect against a division by zero...
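The overflow case can be checked with a small standalone sketch. It is
not the kernel code: sum_sq_dev is a hypothetical name, INTERVALS is
taken as 8 as in the estimate above, and __builtin_add_overflow is a
GCC/Clang builtin. The function accumulates the squared deviations
(the variance before the division) and reports whether the 64-bit sum
wrapped.

```c
#include <stdbool.h>
#include <stdint.h>

#define INTERVALS 8	/* value assumed in the estimate above */

/*
 * Sum of squared deviations from avg, i.e. the variance before the
 * division. Returns false if the 64-bit accumulator overflowed.
 */
static bool sum_sq_dev(const uint32_t *v, uint64_t avg, uint64_t *out)
{
	uint64_t sum = 0;

	for (int i = 0; i < INTERVALS; i++) {
		/* assumes avg <= UINT32_MAX, so d * d fits in 64 bits */
		uint64_t d = v[i] > avg ? v[i] - avg : avg - v[i];

		if (__builtin_add_overflow(sum, d * d, &sum))
			return false;
	}
	*out = sum;
	return true;
}
```

With four samples at 0 and four at UINT32_MAX (avg = UINT32_MAX / 2)
the function returns false; with the large samples capped at 3037000499,
roughly sqrt(4*U64_MAX/INTERVALS), the same pattern fits.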

Rasmus