Message-ID: <CAJWu+ooBRXcND26JDvjvR-gGNWSmnV-adnLfy3HAQn23q_xqAg@mail.gmail.com>
Date:   Mon, 21 Aug 2017 21:34:45 -0700
From:   Joel Fernandes <joelaf@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Brendan Jackman <brendan.jackman@....com>,
        LKML <linux-kernel@...r.kernel.org>,
        Andres Oportus <andresoportus@...gle.com>,
        Ingo Molnar <mingo@...hat.com>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Vincent Guittot <vincent.guittot@...aro.org>
Subject: Re: [PATCH 2/2] sched/fair: Fix use of NULL with find_idlest_group

Hi Peter,

On Mon, Aug 21, 2017 at 2:14 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Mon, Aug 21, 2017 at 04:21:28PM +0100, Brendan Jackman wrote:
>> The current use of returning NULL from find_idlest_group is broken in
>> two cases:
>>
>> a1) The local group is not allowed.
>>
>>    In this case, we currently do not change this_runnable_load or
>>    this_avg_load from its initial value of 0, which means we return
>>    NULL regardless of the load of the other, allowed groups. This
>>    results in pointlessly continuing the find_idlest_group search
>>    within the local group and then returning prev_cpu from
>>    select_task_rq_fair.
>
>> b) smp_processor_id() is the "idlest" and != prev_cpu.
>>
>>    find_idlest_group also returns NULL when the local group is
>>    allowed and is the idlest. The caller then continues the
>>    find_idlest_group search at a lower level of the current CPU's
>>    sched_domain hierarchy. However new_cpu is not updated. This means
>>    the search is pointless and we return prev_cpu from
>>    select_task_rq_fair.
>>
>
> I think it's much simpler than that... but it's late, so who knows ;-)
>
> Both cases seem predicated on the assumption that we'll return @cpu when
> we don't find any idler CPU. Consider: if the local group is the idlest,
> we should stick with @cpu and simply proceed with the child domain.
>
> The confusion, and the bugs, seem to have snuck in when we started
> considering @prev_cpu, whenever that was. The below is mostly code
> movement to put that whole while(sd) loop into its own function.
>
> The effective change is setting @new_cpu = @cpu when we start that loop:
>
<snip>
> ---
>  kernel/sched/fair.c | 83 +++++++++++++++++++++++++++++++----------------------
>  1 file changed, 48 insertions(+), 35 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c77e4b1d51c0..3e77265c480a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5588,10 +5588,10 @@ static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>  }
>
>  /*
> - * find_idlest_cpu - find the idlest cpu among the cpus in group.
> + * find_idlest_group_cpu - find the idlest cpu among the cpus in group.
>   */
>  static int
> -find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
> +find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>  {
>         unsigned long load, min_load = ULONG_MAX;
>         unsigned int min_exit_latency = UINT_MAX;
> @@ -5640,6 +5640,50 @@ static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>         return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
>  }
>
> +static int
> +find_idlest_cpu(struct sched_domain *sd, struct task_struct *p, int cpu, int sd_flag)
> +{
> +       struct sched_domain *tmp;
> +       int new_cpu = cpu;
> +
> +       while (sd) {
> +               struct sched_group *group;
> +               int weight;
> +
> +               if (!(sd->flags & sd_flag)) {
> +                       sd = sd->child;
> +                       continue;
> +               }
> +
> +               group = find_idlest_group(sd, p, cpu, sd_flag);
> +               if (!group) {
> +                       sd = sd->child;
> +                       continue;

But this will still have the issue of pointlessly searching in the
local group when the idlest CPU is in a non-local group? That stems
from the fact that find_idlest_group is broken if the local group is
not allowed.

I believe this is fixed by Brendan's patch:

"Initializing this_runnable_load and this_avg_load to ULONG_MAX
   instead of 0. This means in case a1) we now return the idlest
   non-local group."


Hopefully I didn't miss something. Sorry if I did, thanks,

-Joel
