linux-kernel - Re: [PATCH] sched: rt: Make RT capacity aware


Message-ID: <3e054f2e-75f5-ed87-8640-766828a2fbfb@arm.com>
Date:   Mon, 7 Oct 2019 11:14:11 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Qais Yousef <qais.yousef@....com>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Alessio Balsini <balsini@...roid.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched: rt: Make RT capacity aware

On 23/09/2019 13:52, Qais Yousef wrote:
> On 09/20/19 14:52, Dietmar Eggemann wrote:
>>> 	2. The fallback mechanism means we either have to call cpupri_find()
>>> 	   twice, once to find the filtered lowest_rq and once to return the
>>> 	   non-filtered version.
>>
>> This is what I have in mind. (Only compile tested! ... and the 'if
>> (cpumask_any(lowest_mask) >= nr_cpu_ids)' condition has to be considered
>> as well):
>>
>> @@ -98,8 +103,26 @@ int cpupri_find(struct cpupri *cp, struct task_struct *p,
>>                         continue;
>>
>>                 if (lowest_mask) {
>> +                       int cpu, max_cap_cpu = -1;
>> +                       unsigned long max_cap = 0;
>> +
>>                         cpumask_and(lowest_mask, p->cpus_ptr, vec->mask);
>>
>> +                       for_each_cpu(cpu, lowest_mask) {
>> +                               unsigned long cap = arch_scale_cpu_capacity(cpu);
>> +
>> +                               if (!rt_task_fits_capacity(p, cpu))
>> +                                       cpumask_clear_cpu(cpu, lowest_mask);
>> +
>> +                               if (cap > max_cap) {
>> +                                       max_cap = cap;
>> +                                       max_cap_cpu = cpu;
>> +                               }
>> +                       }
>> +
>> +                       if (cpumask_empty(lowest_mask) && max_cap)
>> +                               cpumask_set_cpu(max_cap_cpu, lowest_mask);
> 
> I had a patch that I was testing but what I did is to continue rather than
> return a max_cap_cpu.

Continuing is the correct thing to do here. I just tried to illustrate
the idea.
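
For readers who want to play with the idea outside the kernel, here is a
minimal userspace sketch of the filter-with-fallback logic from the quoted
hunk. It is not the patch itself: the 32-bit masks stand in for struct
cpumask, the capacity table stands in for arch_scale_cpu_capacity(),
task_fits() stands in for rt_task_fits_capacity(), and the capacity values
and the name filter_with_fallback() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8

/* Assumed capacities (4 little + 4 big CPUs); illustrative values only. */
static const unsigned long cpu_capacity[NR_CPUS] = {
	446, 446, 446, 446, 1024, 1024, 1024, 1024
};

/* Hypothetical stand-in for rt_task_fits_capacity(p, cpu). */
static int task_fits(unsigned long min_cap, int cpu)
{
	return cpu_capacity[cpu] >= min_cap;
}

/*
 * Keep only the candidate CPUs that fit the task. If none fits, fall back
 * to the single candidate with the highest capacity, as in the hunk above.
 */
static uint32_t filter_with_fallback(uint32_t candidates, unsigned long min_cap)
{
	uint32_t fitting = 0;
	unsigned long max_cap = 0;
	int max_cap_cpu = -1;

	assert((candidates >> NR_CPUS) == 0);	/* no bits beyond NR_CPUS */

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(candidates & (UINT32_C(1) << cpu)))
			continue;

		if (task_fits(min_cap, cpu))
			fitting |= UINT32_C(1) << cpu;

		/* Track the biggest candidate for the fallback case. */
		if (cpu_capacity[cpu] > max_cap) {
			max_cap = cpu_capacity[cpu];
			max_cap_cpu = cpu;
		}
	}

	if (!fitting && max_cap_cpu >= 0)
		fitting = UINT32_C(1) << max_cap_cpu;

	return fitting;
}
```

With these assumed capacities, a task that needs 1024 and may only run on
CPUs 0-3 ends up with a one-CPU mask, which is exactly the fallback that
"continue" would avoid.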

> e.g:
> 
> 	if no cpu at current priority fits the task:
> 		continue;
> 	else:
> 		return the lowest_mask which contains fitting cpus only
> 
> 	if no fitting cpu was found:
> 		return 0;

I guess this is what we want to achieve here. It's unavoidable that we
will run sooner (compared to an SMP system) into a situation in which we
have to go higher in the rd->cpupri->pri_to_cpu[] array or in which we
can't return a lower mask at all.
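
The "continue" variant can be sketched in userspace as follows. This is
only a model under stated assumptions: level_mask[] plays the role of the
pri_to_cpu[] vectors (index 0 being the lowest priority), NR_LEVELS is far
smaller than the real array, and fitting_cpus(), find_fitting_level() and
the capacity values are hypothetical names and numbers, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS   8
#define NR_LEVELS 4	/* assumption: tiny model of the priority array */

/* Assumed capacities (4 little + 4 big CPUs); illustrative values only. */
static const unsigned long cpu_capacity[NR_CPUS] = {
	446, 446, 446, 446, 1024, 1024, 1024, 1024
};

/* Return the subset of 'mask' whose capacity is at least 'min_cap'. */
static uint32_t fitting_cpus(uint32_t mask, unsigned long min_cap)
{
	uint32_t fit = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if ((mask & (UINT32_C(1) << cpu)) && cpu_capacity[cpu] >= min_cap)
			fit |= UINT32_C(1) << cpu;
	}
	return fit;
}

/*
 * Walk the levels below the task's own level, lowest priority first.
 * Returns 1 and fills *lowest_mask with fitting CPUs only; returns 0 if
 * no level below the task offers a fitting CPU.
 */
static int find_fitting_level(const uint32_t level_mask[NR_LEVELS],
			      int task_level, uint32_t allowed,
			      unsigned long min_cap, uint32_t *lowest_mask)
{
	assert(task_level >= 0 && task_level <= NR_LEVELS);

	for (int idx = 0; idx < task_level; idx++) {
		uint32_t fit = fitting_cpus(level_mask[idx] & allowed, min_cap);

		if (!fit)
			continue;	/* nothing fits here, try the next level up */

		*lowest_mask = fit;
		return 1;
	}
	return 0;
}
```

In this model the search visits higher levels, or fails outright, as soon
as the little CPUs dominate the low-priority levels, which is the
asymmetric-capacity effect described above.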

> Or we can tweak your approach to be
> 
> 	if no cpu at current priority fits the task:
> 		if the cpu the task is currently running on doesn't fit it:
> 			return lowest_mask with max_cap_cpu set;

I wasn't aware of the pri_to_cpu[] array and how cpupri_find(,
lowest_mask) tries to return the lowest_mask of the lowest priority
(pri_to_cpu[] index).

> So we either:
> 
> 	1. Continue the search until we find a fitting CPU; bail out otherwise.

If this describes the solution in which we concentrate the
capacity-awareness in cpupri_find(), then I'm OK with it.
find_lowest_rq() already favours task_cpu(task), this_cpu and finally
cpus in sched_groups (from the viewpoint of task_cpu(task)).

> 	2. Or we attempt to return a CPU only if the CPU the task is currently
> 	   running on doesn't fit it. We don't want to migrate the task from a
> 	   fitting to a non-fitting one.

I would prefer 1., keeping the necessary changes confined in
cpupri_find() if possible.

> We can also do something hybrid like:
> 
> 	3. Remember the outcome of 2 but don't return immediately and attempt
> 	   to find a fitting CPU at a different priority level.
> 
> 
> Personally I see 1 as the simplest and good enough solution. What do you think?

Agreed. We would potentially need a fast lookup for p -> uclamp_cpumask
though?
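
One possible reading of such a fast lookup, sketched in userspace: cache a
per-task mask of fitting CPUs and recompute it only when the task's
capacity requirement changes, so the per-level filtering becomes a single
AND. Everything here is an assumption for illustration: the struct, the
field names uclamp_min and fit_mask, both helper names, and the capacity
values. None of it is an existing kernel interface.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8

/* Assumed capacities (4 little + 4 big CPUs); illustrative values only. */
static const unsigned long cpu_capacity[NR_CPUS] = {
	446, 446, 446, 446, 1024, 1024, 1024, 1024
};

/* Hypothetical task model: a capacity requirement plus a cached mask. */
struct task {
	unsigned long uclamp_min;	/* minimum capacity the task asks for */
	uint32_t fit_mask;		/* cached set of CPUs that satisfy it */
};

/* Recompute the cached mask; call whenever uclamp_min changes. */
static void task_update_fit_mask(struct task *p)
{
	assert(p);

	p->fit_mask = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu_capacity[cpu] >= p->uclamp_min)
			p->fit_mask |= UINT32_C(1) << cpu;
	}
}

/* Filtering one priority level is then a plain mask intersection. */
static uint32_t lowest_fitting(const struct task *p, uint32_t allowed,
			       uint32_t vec_mask)
{
	return allowed & vec_mask & p->fit_mask;
}
```

The trade-off is extra per-task state that has to be kept in sync with
every change of the requirement, in exchange for no per-CPU loop in the
lookup path.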

> I think this 'continue to search' makes doing it at cpupri_find() more
> robust than having to deal with whatever mask we first found in
> find_lowest_rq() - so I'm starting to like this approach better. Thanks for
> bringing it up.

My main concern is that having rt_task_fits_capacity() added to almost
every condition in the code makes it hard to understand what capacity
awareness in RT wants to achieve.

[...]