[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c79ac631-61c7-4953-a657-74047a264029@efficios.com>
Date: Mon, 23 Oct 2023 11:04:49 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Dietmar Eggemann <dietmar.eggemann@....com>,
Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>,
Swapnil Sapkal <Swapnil.Sapkal@....com>,
Aaron Lu <aaron.lu@...el.com>, Chen Yu <yu.c.chen@...el.com>,
Tim Chen <tim.c.chen@...el.com>,
K Prateek Nayak <kprateek.nayak@....com>,
"Gautham R . Shenoy" <gautham.shenoy@....com>, x86@...nel.org
Subject: Re: [RFC PATCH v2 1/2] sched/fair: Introduce UTIL_FITS_CAPACITY
feature (v2)
On 2023-10-23 10:11, Dietmar Eggemann wrote:
> On 19/10/2023 18:05, Mathieu Desnoyers wrote:
[...]
>>
>> +static unsigned long scale_rt_capacity(int cpu);
>> +
>> +/*
>> + * Returns true if adding the task utilization to the estimated
>> + * utilization of the runnable tasks on @cpu does not exceed the
>> + * capacity of @cpu.
>> + *
>> + * This considers only the utilization of _runnable_ tasks on the @cpu
>> + * runqueue, excluding blocked and sleeping tasks. This is achieved by
>> + * using the runqueue util_est.enqueued.
>> + */
>> +static inline bool task_fits_remaining_cpu_capacity(unsigned long task_util,
>> + int cpu)
>
> This is almost like the existing task_fits_cpu(p, cpu) (used in Capacity
> Aware Scheduling (CAS) for Asymmetric CPU capacity systems) except the
> latter only uses `util = task_util_est(p)` and deals with uclamp as well
> and only tests whether p could fit on the CPU.
This is indeed a major difference between how asym capacity check works
and what is introduced here:
asym capacity check only checks whether the given task theoretically
fits in the cpu if that cpu was completely idle, without considering the
current cpu utilization.
My approach is to consider the current util_est of the cpu to check
whether the task fits in the remaining capacity.
I did not want to use the existing task_fits_cpu() helper because the
notions of uclamp bounds appear to be heavily tied to the fact that it
checks whether the task fits in an _idle_ runqueue, whereas the check I
am introducing here is much more restrictive: it checks that the task
fits on the runqueue within the remaining capacity.
>
> Or like find_energy_efficient_cpu() (feec(), used in
> Energy-Aware-Scheduling (EAS)) which uses cpu_util(cpu, p, cpu, 0) to get:
>
> max(util_avg(CPU + p), util_est(CPU + p))
I've tried using cpu_util(), but unfortunately anything that considers
blocked/sleeping tasks in its utilization total does not work for my
use-case.
From cpu_util():
* CPU utilization is the sum of running time of runnable tasks plus the
* recent utilization of currently non-runnable tasks on that CPU.
>
> feec()
> ...
> for (; pd; pd = pd->next)
> ...
> util = cpu_util(cpu, p, cpu, 0);
> ...
> fits = util_fits_cpu(util, util_min, util_max, cpu)
> ^^^^^^^^^^^^^^^^^^
> not used when uclamp is not active (1)
> ...
> capacity = capacity_of(cpu)
> fits = fits_capacity(util, capacity)
> if (!uclamp_is_used()) (1)
> return fits
>
> So not introducing new functions like task_fits_remaining_cpu_capacity()
> in this area and using existing one would be good.
If the notion of uclamp is not tied to the way asym capacity check is
done against a theoretically idle runqueue, I'd be OK with using this,
but so far both appear to be very much tied.
When I stumbled on this fundamental difference between asym cpu capacity
check and the check introduced here, I've started wondering whether the
asym cpu capacity check would benefit from considering the target cpu
current utilization as well.
>
>> +{
>> + unsigned long total_util;
>> +
>> + if (!sched_util_fits_capacity_active())
>> + return false;
>> + total_util = READ_ONCE(cpu_rq(cpu)->cfs.avg.util_est.enqueued) + task_util;
>> + return fits_capacity(total_util, scale_rt_capacity(cpu));
>
> Why not use:
>
> static unsigned long capacity_of(int cpu)
> return cpu_rq(cpu)->cpu_capacity;
>
> which is maintained in update_cpu_capacity() as scale_rt_capacity(cpu)?
The reason for preferring scale_rt_capacity(cpu) over capacity_of(cpu)
is that update_cpu_capacity() only runs periodically every
balance-interval, therefore providing a coarse-grained remaining
capacity approximation with respect to irq, rt, dl, and thermal utilization.
If it turns out that being coarse-grained is good enough, we may be able
to save some cycles by using capacity_of(), but not without carefully
considering the impacts of being imprecise.
>
> [...]
>
>> @@ -7173,7 +7200,8 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>> if (recent_used_cpu != prev &&
>> recent_used_cpu != target &&
>> cpus_share_cache(recent_used_cpu, target) &&
>> - (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
>> + (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu) ||
>> + task_fits_remaining_cpu_capacity(task_util, recent_used_cpu)) &&
>> cpumask_test_cpu(recent_used_cpu, p->cpus_ptr) &&
>> asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) {
>> return recent_used_cpu;
>> diff --git a/kernel/sched/features.h b/kernel/sched/features.h
>> index ee7f23c76bd3..9a84a1401123 100644
>> --- a/kernel/sched/features.h
>> +++ b/kernel/sched/features.h
>> @@ -97,6 +97,12 @@ SCHED_FEAT(WA_BIAS, true)
>> SCHED_FEAT(UTIL_EST, true)
>> SCHED_FEAT(UTIL_EST_FASTUP, true)
>
> IMHO, asymmetric CPU capacity systems would have to disable the sched
> feature UTIL_FITS_CAPACITY. Otherwise CAS could deliver different
> results. task_fits_remaining_cpu_capacity() and asym_fits_cpu() work
> slightly different.
I don't think they should be mutually exclusive. We should look into the
differences between those two more closely to make them work nicely
together instead. For instance, why does asym capacity only consider
whether tasks fit in a theoretically idle runqueue, when it could use
the current utilization of the runqueue to check that the task fits in
the remaining capacity ?
Unfortunately I don't have a machine with asym cpu to test locally.
Thanks for your feedback !
Mathieu
>
> [...]
>
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
Powered by blists - more mailing lists