Message-ID: <568E8782.9020401@linaro.org>
Date: Thu, 7 Jan 2016 16:42:58 +0100
From: Daniel Lezcano <daniel.lezcano@...aro.org>
To: Nicolas Pitre <nicolas.pitre@...aro.org>
Cc: tglx@...utronix.de, peterz@...radead.org, rafael@...nel.org,
linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
vincent.guittot@...aro.org
Subject: Re: [RFC PATCH 2/2] sched: idle: IRQ based next prediction for idle
period
On 01/06/2016 06:40 PM, Nicolas Pitre wrote:
> On Wed, 6 Jan 2016, Daniel Lezcano wrote:
>
>> Many IRQs are quiet most of the time, or they tend to come in bursts of
>> fairly equal time intervals within each burst. It is therefore possible
>> to detect those IRQs with stable intervals and guestimate when the next
>> IRQ event is most likely to happen.
>>
>> Examples of such IRQs may include audio related IRQs where the FIFO size
>> and/or DMA descriptor size with the sample rate create stable intervals,
>> block devices during large data transfers, etc. Even network streaming
>> of multimedia content creates patterns of periodic network interface IRQs
>> in some cases.
>>
>> This patch adds code to track the mean interval and variance for each IRQ
>> over a window of time intervals between IRQ events. Those statistics can
>> be used to assist cpuidle in selecting the most appropriate sleep state
>> by predicting the most likely time for the next interrupt.
>>
>> Because the stats are gathered in interrupt context, the core computation
>> is as light as possible.
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@...aro.org>
>
> There are still a few problems with this patch.
>
> Please see comments below.
[ ... ]
>> +/**
>> + * stats_variance - compute the variance
>> + *
>> + * @s: the statistic structure
>> + *
>> + * Returns an u64 corresponding to the variance, or zero if there is
>> + * no data
>> + */
>> +static u64 stats_variance(struct stats *s, u32 mean)
>> +{
>> + int i;
>> + u64 variance = 0;
>> +
>> + /*
>> + * The variance is the sum of the squared difference to the
>> + * average divided by the number of elements.
>> + */
>> + for (i = 0; i < STATS_NR_VALUES; i++) {
>> + s32 diff = s->values[i] - mean;
>> + variance += (u64)diff * diff;
>
> Strictly speaking, diff should be cast to s64 as it is a signed value
> that may actually be negative. Because of the strange C type promotion
> rules, the generated code appears correct (at least on ARM), but it
> would be clearer to use s64 anyway.
I don't see why it should be an s64. diff is already a signed s32:
s->values[] are s32 and mean is u32. What would be the benefit of
converting diff to s64?
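
To make sure I am not missing something, here is a quick user-space
sketch (the sample values are made up) showing both casts accumulate the
same sum, because the low 64 bits of the squared difference are
identical either way:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            int32_t values[] = { 100, 90, 110 };    /* made-up samples */
            uint32_t mean = 100;
            uint64_t var_u64 = 0;
            int64_t var_s64 = 0;
            int i;

            for (i = 0; i < 3; i++) {
                    int32_t diff = values[i] - mean;

                    /* (u64)diff wraps a negative diff modulo 2^64 ... */
                    var_u64 += (uint64_t)diff * diff;
                    /* ... while (s64)diff sign-extends it. */
                    var_s64 += (int64_t)diff * diff;
            }

            /* Both print 200. */
            printf("%llu %lld\n", (unsigned long long)var_u64,
                   (long long)var_s64);
            return 0;
    }

Which seems to confirm it is a readability question rather than a
correctness one.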
> The product will end up being positive in all cases of course so
> variance may remain as a u64.
>
[ ... ]
>> +static ktime_t next_irq_event(void)
>> +{
>> + unsigned int irq, cpu = raw_smp_processor_id();
>> + ktime_t diff, next, min = { .tv64 = KTIME_MAX };
>
> To avoid exposing the ktime_t definition, you should use
> ktime_set(KTIME_SEC_MAX, 0) instead of { .tv64 = KTIME_MAX }.
Ok.
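
So, if I follow, the declaration simply becomes:

    ktime_t diff, next, min = ktime_set(KTIME_SEC_MAX, 0);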
>> + struct wakeup *w;
>> + u32 interval, mean;
>> + u64 variance;
>> +
>> + /*
>> + * Lookup the interrupt array for this cpu and search for the
>> + * earlier expected interruption.
>> + */
>> + for (irq = 0; irq < NR_IRQS; irq++) {
>> +
>> + ktime_t now = ktime_get();
>
> ktime_get() is potentially non-trivial. It should be plenty good enough
> to call it only once outside the loop.
Ok.
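
IOW, roughly (untested):

    ktime_t now = ktime_get();

    for (irq = 0; irq < NR_IRQS; irq++) {

            w = per_cpu(wakeups[irq], cpu);
            ...
    }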
>> + w = per_cpu(wakeups[irq], cpu);
>> +
>> + /*
>> + * The interrupt was not setup as a source of a wakeup
>> + * or the wakeup source is not considered at this
>> + * moment stable enough to do a prediction.
>> + */
>> + if (!w)
>> + continue;
>> +
>> + /*
>> + * No statistics available yet.
>> + */
>> + if (ktime_equal(w->timestamp, ktime_set(0, 0)))
>> + continue;
>> +
>> + diff = ktime_sub(now, w->timestamp);
>> +
>> + /*
>> + * There is no point attempting predictions on interrupts more
>> + * than 1 second apart. This has no benefit for sleep state
>> + * selection and increases the risk of overflowing our variance
>> + * computation. Reset all stats in that case.
>> + */
>> + if (unlikely(ktime_after(diff, ktime_set(1, 0)))) {
>> + stats_reset(&w->stats);
>> + continue;
>> + }
>
> The above is wrong. It is not computing the interval between successive
> interrupts but rather the interval between the last interrupt occurrence
> and the present time (i.e. when we're about to go idle). This won't
> prevent interrupt intervals greater than one second from being summed
> and potentially overflowing the variance if this code is executed less
> than a second after one such IRQ interval. This test should rather be
> performed in sched_idle_irq().
Ok, I will move it.
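
Something along these lines in sched_idle_irq(), I guess (rough sketch,
I am assuming the surrounding variable names, they may differ in the
actual patch):

    /*
     * In the interrupt path 'diff' really is the interval between two
     * successive occurrences of the same IRQ, so do the sanity check
     * here before feeding the statistics.
     */
    diff = ktime_sub(now, w->timestamp);
    w->timestamp = now;

    if (unlikely(ktime_after(diff, ktime_set(1, 0)))) {
            stats_reset(&w->stats);
            return;
    }

    stats_add(&w->stats, ktime_to_us(diff));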
>> + * If the mean value is null, just ignore this wakeup
>> + * source.
>> + */
>> + mean = stats_mean(&w->stats);
>> + if (!mean)
>> + continue;
>> +
>> + variance = stats_variance(&w->stats, mean);
>> + /*
>> + * We want to check the last interval is:
>> + *
>> + * mean - stddev < interval < mean + stddev
>> + *
>> + * That simplifies to:
>> + *
>> + * -stddev < interval - mean < stddev
>> + *
>> + * abs(interval - mean) < stddev
>> + *
>> + * The standard deviation is the sqrt of the variance:
>> + *
>> + * abs(interval - mean) < sqrt(variance)
>> + *
>> + * and since we want to avoid computing a sqrt, we square
>> + * both sides:
>> + *
>> + * (interval - mean)^2 < variance
>> + *
>> + * So if the latest value of the stats complies with
>> + * this condition, then the wakeup source is
>> + * considered predictable and can be used to predict
>> + * the next event.
>> + */
>> + interval = w->stats.values[(w->stats.w_ptr + 1) % STATS_NR_VALUES];
>
> But here (w->stats.w_ptr + 1) % STATS_NR_VALUES does not point at the
> last interval. It rather points at the second oldest.
> To make things simpler, you should rather pre-increment the pointer
> before updating the array in stats_add(), and here the value you want
> will directly be accessible with w->stats.values[w->stats.w_ptr].
Yeah, there is a glitch here. I will look at it.
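
So stats_add() would become something like this (sketch: only the
pointer handling is shown and I am guessing the exact prototype, the
rest of the function is omitted):

    static void stats_add(struct stats *s, u32 value)
    {
            /*
             * Pre-increment, so that w_ptr always points at the most
             * recently stored value.
             */
            s->w_ptr = (s->w_ptr + 1) % STATS_NR_VALUES;
            s->values[s->w_ptr] = value;
            ...
    }

and the lookup above simply becomes:

    interval = w->stats.values[w->stats.w_ptr];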
>> + if ((u64)((interval - mean) * (interval - mean)) > variance)
>> + continue;
>
> Same comment as in stats_variance(): this should be s64.
>
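Ok, so something like this, which also keeps the product in 64 bits:

    s64 diff = (s64)interval - mean;

    if ((u64)(diff * diff) > variance)
            continue;
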
>> + /*
>> + * Let's compute the next event: the wakeup source is
>> + * considered predictable, so we add the average interval
>> + * to the time of the latest interrupt.
>> + */
>> + next = ktime_add_us(w->timestamp, stats_mean(&w->stats));
>> +
>> + /*
>> + * If the interrupt is supposed to happen before the
>> + * minimum time, then it becomes the minimum.
>> + */
>> + if (ktime_before(next, min))
>> + min = next;
>> + }
>> +
>> + /*
>> + * At this point, we have our prediction but the caller is
>> + * expecting the remaining time before the next event, so
>> + * compute the expected sleep length.
>> + */
>> + diff = ktime_sub(min, ktime_get());
>
> You should use the variable 'now' rather than asking for the current
> time again.
Yep.
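
i.e.:

    diff = ktime_sub(min, now);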
>> + /*
>> + * The result could be negative for different reasons:
>> + * - the prediction is incorrect
>> + * - the prediction was too near now and expired while we were
>> + * in this function
>> + *
>> + * In both cases, we return KTIME_MAX as a failure to do a
>> + * prediction
>> + */
>> + if (ktime_compare(diff, ktime_set(0, 0)) <= 0)
>> + return (ktime_t){ .tv64 = KTIME_MAX };
>
> See my comment about ktime_t internals at the beginning of this
> function.
Ok.
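
i.e. here as well:

    return ktime_set(KTIME_SEC_MAX, 0);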
Thanks for the review.
-- Daniel
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists