linux-kernel - Re: [PATCH] cpufreq: schedutil: rate limits for SCHED

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <197c26ba-c2a6-2de7-dffa-5b884079f746@evidence.eu.com>
Date:   Fri, 9 Feb 2018 09:02:34 +0100
From:   Claudio Scordino <claudio@...dence.eu.com>
To:     Viresh Kumar <viresh.kumar@...aro.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Patrick Bellasi <patrick.bellasi@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Todd Kjos <tkjos@...roid.com>,
        Joel Fernandes <joelaf@...gle.com>, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cpufreq: schedutil: rate limits for SCHED_DEADLINE

Hi Viresh,

Il 09/02/2018 04:51, Viresh Kumar ha scritto:
> On 08-02-18, 18:01, Claudio Scordino wrote:
>> When the SCHED_DEADLINE scheduling class increases the CPU utilization,
>> we should not wait for the rate limit, otherwise we may miss some deadline.
>>
>> Tests using rt-app on Exynos5422 have shown reductions of about 10% of deadline
>> misses for tasks with low RT periods.
>>
>> The patch applies on top of the one recently proposed by Peter to drop the
>> SCHED_CPUFREQ_* flags.
>>
>> Signed-off-by: Claudio Scordino <claudio@...dence.eu.com>
>> CC: Rafael J . Wysocki <rafael.j.wysocki@...el.com>
>> CC: Patrick Bellasi <patrick.bellasi@....com>
>> CC: Dietmar Eggemann <dietmar.eggemann@....com>
>> CC: Morten Rasmussen <morten.rasmussen@....com>
>> CC: Juri Lelli <juri.lelli@...hat.com>
>> CC: Viresh Kumar <viresh.kumar@...aro.org>
>> CC: Vincent Guittot <vincent.guittot@...aro.org>
>> CC: Todd Kjos <tkjos@...roid.com>
>> CC: Joel Fernandes <joelaf@...gle.com>
>> CC: linux-pm@...r.kernel.org
>> CC: linux-kernel@...r.kernel.org
>> ---
>>   kernel/sched/cpufreq_schedutil.c | 15 ++++++++++++---
>>   1 file changed, 12 insertions(+), 3 deletions(-)
> 
> So the previous commit was surely incorrect as it relied on comparing
> frequencies instead of dl-util, and freq requirements could have even
> changed due to CFS.

You're right.
The very original patch (not posted) added a specific SCHED_CPUFREQ flag to let the scheduling class ask for ignoring the rate limit.
However, polluting the API with further flags is not such a good approach.
The next patches didn't introduce such flag, but were incorrect.

> 
>> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> index b0bd77d..d8dcba2 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -74,7 +74,10 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
>>   
>>   /************************ Governor internals ***********************/
>>   
>> -static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>> +static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
>> +				     u64 time,
>> +				     struct sugov_cpu *sg_cpu_old,
>> +				     struct sugov_cpu *sg_cpu_new)
>>   {
>>   	s64 delta_ns;
>>   
>> @@ -111,6 +114,10 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>>   		return true;
>>   	}
>>   
>> +	/* Ignore rate limit when DL increased utilization. */
>> +	if (sg_cpu_new->util_dl > sg_cpu_old->util_dl)
>> +		return true;
>> +
> 
> Changing the frequency has a penalty, specially in the ARM world (and
> that's where you are testing your stuff). I am worried that we will
> have (corner) cases where we will waste a lot of time changing the
> frequencies. For example (I may be wrong here), what if 10 small DL
> tasks are queued one after the other? The util will keep on changing
> and so will the frequency ? There may be more similar cases ?

I forgot to say that I've not observed any relevant increase of the energy consumption (measured through a Baylibre Cape).
However, the tests had a very small number of RT tasks.

If I'm not wrong, at the hardware level we do have a physical rate limit (as we cannot trigger a frequency update when there is one already on-going).
Don't know if this could somehow mitigate such effect.

Anyway, I'll repeat the tests with a considerable amount of RT tasks to check if I can reproduce such "ramp up" situation.
Depending on the energy results, we may have to choose between meeting more RT deadlines and consuming less energy.

> 
> Is it possible to (somehow) check here if the DL tasks will miss
> deadline if we continue to run at current frequency? And only ignore
> rate-limit if that is the case ?

I need to think further about it.

> 
>>   	delta_ns = time - sg_policy->last_freq_update_time;
>>   	return delta_ns >= sg_policy->freq_update_delay_ns;
>>   }
>> @@ -271,6 +278,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>>   				unsigned int flags)
>>   {
>>   	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
>> +	struct sugov_cpu sg_cpu_old = *sg_cpu;
> 
> Not really a big deal, but this structure is 80 bytes on ARM64, why
> copy everything when what we need is just 8 bytes ?

I didn't want to add deadline-specific code into the sugov_should_update_freq() signature as it should remain independent from the scheduling classes.
In my opinion, the best approach would be to group util_cfs and util_dl in a struct within sugov_cpu and pass that struct to sugov_should_update_freq().

Thanks for your comments.

                Claudio