Message-ID: <000001d545e3$047d9750$0d78c5f0$@net>
Date: Mon, 29 Jul 2019 00:55:37 -0700
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Viresh Kumar'" <viresh.kumar@...aro.org>
Cc: "'Rafael J. Wysocki'" <rafael@...nel.org>,
"'Rafael Wysocki'" <rjw@...ysocki.net>,
"'Ingo Molnar'" <mingo@...hat.com>,
"'Peter Zijlstra'" <peterz@...radead.org>,
"'Linux PM'" <linux-pm@...r.kernel.org>,
"'Vincent Guittot'" <vincent.guittot@...aro.org>,
"'Joel Fernandes'" <joel@...lfernandes.org>,
"'v4 . 18+'" <stable@...r.kernel.org>,
"'Linux Kernel Mailing List'" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] cpufreq: schedutil: Don't skip freq update when limits change
On 2019.07.25 23:58 Viresh Kumar wrote:
> On 25-07-19, 08:20, Doug Smythies wrote:
>> I tried the patch ("patch2"). It did not fix the issue.
>>
>> To summarize, all kernel 5.2 based, all intel_cpufreq driver and schedutil governor:
>>
>> Test: Does a busy system respond to maximum CPU clock frequency reduction?
>>
>> stock, unaltered: No.
>> revert ecd2884291261e3fddbc7651ee11a20d596bb514: Yes
>> viresh patch: No.
>> fast_switch edit: No.
>> viresh patch2: No.
>
> Hmm, so I tried to reproduce your setup on my ARM board.
> - booted only with CPU0 so I hit the sugov_update_single() routine
> - And applied below diff to make CPU look permanently busy:
>
> -------------------------8<-------------------------
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 2f382b0959e5..afb47490e5dc 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -121,6 +121,7 @@ static void sugov_fast_switch(struct sugov_policy *sg_policy, u64 time,
> if (!sugov_update_next_freq(sg_policy, time, next_freq))
> return;
>
> + pr_info("%s: %d: %u\n", __func__, __LINE__, freq);
There is no "freq" variable in scope here, so this line does not compile. Using next_freq instead works:
+ pr_info("%s: %d: %u\n", __func__, __LINE__, next_freq);
> next_freq = cpufreq_driver_fast_switch(policy, next_freq);
> if (!next_freq)
> return;
> @@ -424,14 +425,10 @@ static unsigned long sugov_iowait_apply(struct sugov_cpu *sg_cpu, u64 time,
> #ifdef CONFIG_NO_HZ_COMMON
> static bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu)
> {
> - unsigned long idle_calls = tick_nohz_get_idle_calls_cpu(sg_cpu->cpu);
> - bool ret = idle_calls == sg_cpu->saved_idle_calls;
> -
> - sg_cpu->saved_idle_calls = idle_calls;
> - return ret;
> + return true;
> }
> #else
> -static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
> +static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return true; }
> #endif /* CONFIG_NO_HZ_COMMON */
>
> /*
> @@ -565,6 +562,7 @@ static void sugov_work(struct kthread_work *work)
> sg_policy->work_in_progress = false;
> raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);
>
> + pr_info("%s: %d: %u\n", __func__, __LINE__, freq);
> mutex_lock(&sg_policy->work_lock);
> __cpufreq_driver_target(sg_policy->policy, freq, CPUFREQ_RELATION_L);
> mutex_unlock(&sg_policy->work_lock);
>
> -------------------------8<-------------------------
>
> Now, the frequency never gets down and so gets set to the maximum
> possible after a bit.
>
> - Then I did:
>
> echo <any-low-freq-value> > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
>
> Without my patch applied:
> The print never gets printed and so frequency doesn't go down.
>
> With my patch applied:
> The print gets printed immediately from sugov_work() and so
> the frequency reduces.
>
> Can you try with this diff along with my Patch2 ? I suspect there may
> be something wrong with the intel_cpufreq driver as the patch fixes
> the only path we have in the schedutil governor which takes busyness
> of a CPU into account.
With this diff applied along with your patch2, there is never a print
message from sugov_work. There are print messages from sugov_fast_switch.
Note that for the intel_cpufreq CPU scaling driver and the schedutil
governor I adjust the maximum clock frequency this way:
echo <any-low-percent> > /sys/devices/system/cpu/intel_pstate/max_perf_pct
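For anyone repeating the test, the adjustment above can be wrapped roughly as follows. This is only a sketch: the SYSFS_CPU override, the function names, and the 1-100 range check are my own assumptions for illustration, not anything the driver requires. Only the sysfs paths themselves are real.

```shell
#!/bin/sh
# Hypothetical override so the helpers can be pointed at a scratch
# directory for a dry run; defaults to the real sysfs location.
SYSFS_CPU="${SYSFS_CPU:-/sys/devices/system/cpu}"

# Write a percentage to intel_pstate/max_perf_pct.
# The numeric and 1..100 checks are an assumption of this sketch.
set_max_perf_pct() {
    pct="$1"
    case "$pct" in
        ''|*[!0-9]*) echo "not a number: $pct" >&2; return 1 ;;
    esac
    if [ "$pct" -lt 1 ] || [ "$pct" -gt 100 ]; then
        echo "out of range: $pct" >&2
        return 1
    fi
    echo "$pct" > "$SYSFS_CPU/intel_pstate/max_perf_pct"
}

# Print the per-policy limit and current frequency as sysfs reports them.
report_policies() {
    for p in "$SYSFS_CPU"/cpufreq/policy*; do
        [ -d "$p" ] || continue
        printf '%s max=%s cur=%s\n' "${p##*/}" \
            "$(cat "$p/scaling_max_freq")" "$(cat "$p/scaling_cur_freq")"
    done
}
```

On a busy system, run as root, one would call set_max_perf_pct with a low value and then report_policies a few times to see whether the current frequency follows the new limit.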
I also applied the pr_info messages to the reverted kernel and re-ran
my tests (where everything works as expected). Again, there is never a
print message from sugov_work. There are print messages from sugov_fast_switch.
Notes:
I do not know whether:
/sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq
/sys/devices/system/cpu/cpufreq/policy*/scaling_min_freq
need to be accurate when using the intel_pstate driver in passive mode.
They are not accurate.
The commit comment for 9083e4986124389e2a7c0ffca95630a4983887f0
suggests that they might need to be representative.
I wonder if something similar to that commit is needed
for other global changes, such as max_perf_pct and min_perf_pct?
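A rough way to check whether the per-policy limits track the global one is sketched below. The SYSFS_CPU override and the function name are hypothetical. The expected value (cpuinfo_max_freq scaled by max_perf_pct) is only my approximation, since the driver works in P-state units and may round differently.

```shell
#!/bin/sh
# Hypothetical override for dry runs; defaults to the real sysfs location.
SYSFS_CPU="${SYSFS_CPU:-/sys/devices/system/cpu}"

# Report every policy whose scaling_max_freq is above the approximate
# limit implied by max_perf_pct. Returns 1 if any policy is reported.
check_policy_limits() {
    pct="$(cat "$SYSFS_CPU/intel_pstate/max_perf_pct")" || return 1
    status=0
    for p in "$SYSFS_CPU"/cpufreq/policy*; do
        [ -d "$p" ] || continue
        hw_max="$(cat "$p/cpuinfo_max_freq")"
        cur_max="$(cat "$p/scaling_max_freq")"
        # Approximation only, see the note above.
        expected=$((hw_max * pct / 100))
        if [ "$cur_max" -gt "$expected" ]; then
            echo "${p##*/}: scaling_max_freq=$cur_max exceeds ~$expected (max_perf_pct=$pct)"
            status=1
        fi
    done
    return $status
}
```

A policy reported here only means the sysfs value is not representative of the global limit. It says nothing about what frequency the hardware is actually running at.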
intel_cpufreq/ondemand doesn't work properly on the reverted kernel.
(Just discovered, not yet investigated.)
I don't know about other governors.
... Doug