linux-kernel - Re: [PATCH v2 8/8] sched/fair: use utilization to select misfit task

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <61b1cabc-13ad-6ee0-df0c-72f3b11d368d@arm.com>
Date:   Fri, 2 Aug 2019 11:49:06 +0100
From:   Valentin Schneider <valentin.schneider@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Phil Auld <pauld@...hat.com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        Quentin Perret <quentin.perret@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Morten Rasmussen <Morten.Rasmussen@....com>,
        Qais Yousef <qais.yousef@....com>
Subject: Re: [PATCH v2 8/8] sched/fair: use utilization to select misfit task

On 02/08/2019 09:29, Vincent Guittot wrote:
>> However what would be *even* better IMO would be:
>>
>> -----8<-----
>> @@ -8853,6 +8853,7 @@ voluntary_active_balance(struct lb_env *env)
>>                         return 1;
>>         }
>>
>> +       /* XXX: make sure current is still a misfit task? */
>>         if (env->balance_type == migrate_misfit)
>>                 return 1;
>>
>> @@ -8966,6 +8967,20 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>>         env.src_rq = busiest;
>>
>>         ld_moved = 0;
>> +
>> +       /*
>> +        * Misfit tasks are only misfit if they are currently running, see
>> +        * update_misfit_status().
>> +        *
>> +        * - If they're not running, we'll get an opportunity at wakeup to
>> +        *   migrate them to a bigger CPU.
>> +        * - If they're running, we need to active balance them to a bigger CPU.
>> +        *
>> +        * Do the smart thing and go straight for active balance.
>> +        */
>> +       if (env->balance_type == migrate_misfit)
>> +               goto active_balance;
>> +
> 
> This looks ugly and add a new bypass which this patchset tries to remove
> This doesn't work if your misfit task has been preempted by another
> one during the load balance and waiting for the runqueue

I won't comment on aesthetics, but when it comes to preempted misfit tasks
do consider that a task *has* to have a utilization >= 80% of its CPU's
capacity to be flagged as misfit.
If it's a busy loop, it can only be preempted ~20% of the time to still be
flagged as a misfit task, so going straight for active balance will be the
right thing to do in the majority of cases.

What we gain from doing that is we save ourselves from (potentially
needlessly) iterating over the rq's tasks. That's less work and less rq
lock contention.

To put a bit of contrast on this, Qais did some profiling of usual mobile
workloads on a regular 4+4 big.LITTLE smartphone and IIRC the rq depth rose
very rarely above 5, although the tail did reach ~100 tasks. So most of the
time it would be fine to go through the detach_tasks() path.

This all deserves some actual benchmarking, I'll give it a shot.

>>         if (busiest->nr_running > 1) {
>>                 /*
>>                  * Attempt to move tasks. If find_busiest_group has found
>> @@ -9074,7 +9089,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>>                         goto out_all_pinned;
>>                 }
>>         }
>> -
>> +active_balance:
>>         if (!ld_moved) {
>>                 schedstat_inc(sd->lb_failed[idle]);
>>                 /*
>> ----->8-----