lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <24b07842-1770-13b9-8182-8dcf4f0a28fa@linux.ibm.com>
Date:   Wed, 9 Oct 2019 22:25:23 +0530
From:   Parth Shah <parth@...ux.ibm.com>
To:     Vincent Guittot <vincent.guittot@...aro.org>,
        Hillf Danton <hdanton@...a.com>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        "open list:THERMAL" <linux-pm@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Patrick Bellasi <patrick.bellasi@...bug.net>,
        Valentin Schneider <valentin.schneider@....com>,
        Pavel Machek <pavel@....cz>,
        Doug Smythies <dsmythies@...us.net>,
        Quentin Perret <qperret@...rret.net>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Daniel Lezcano <daniel.lezcano@...aro.org>
Subject: Re: [RFC v5 4/6] sched/fair: Tune task wake-up logic to pack small
 background tasks on fewer cores



On 10/9/19 5:04 PM, Vincent Guittot wrote:
> On Wed, 9 Oct 2019 at 11:23, Parth Shah <parth@...ux.ibm.com> wrote:
>>
>>
>>
>> On 10/8/19 6:58 PM, Hillf Danton wrote:
>>>
>>> On Mon,  7 Oct 2019 14:00:49 +0530 Parth Shah wrote:
>>>> +/*
>>>> + * Try to find a non idle core in the system  based on few heuristics:
>>>> + * - Keep track of overutilized (>80% util) and busy (>12.5% util) CPUs
>>>> + * - If none CPUs are busy then do not select the core for task packing
>>>> + * - If atleast one CPU is busy then do task packing unless overutilized CPUs
>>>> + *   count is < busy/2 CPU count
>>>> + * - Always select idle CPU for task packing
>>>> + */
>>>> +static int select_non_idle_core(struct task_struct *p, int prev_cpu, int target)
>>>> +{
>>>> +    struct cpumask *cpus = this_cpu_cpumask_var_ptr(turbo_sched_mask);
>>>> +    int iter_cpu, sibling;
>>>> +
>>>> +    cpumask_and(cpus, cpu_online_mask, p->cpus_ptr);
>>>> +
>>>> +    for_each_cpu_wrap(iter_cpu, cpus, prev_cpu) {
>>>> +            int idle_cpu_count = 0, non_idle_cpu_count = 0;
>>>> +            int overutil_cpu_count = 0;
>>>> +            int busy_cpu_count = 0;
>>>> +            int best_cpu = iter_cpu;
>>>> +
>>>> +            for_each_cpu(sibling, cpu_smt_mask(iter_cpu)) {
>>>> +                    __cpumask_clear_cpu(sibling, cpus);
>>>> +                    if (idle_cpu(iter_cpu)) {
>>>
>>> Would you please elaborate the reasons that the iter cpu is checked idle
>>> more than once for finding a busy core?
>>>
>>
>> Thanks for looking at the patches.
>> Could you please point me out where iter_cpu is checked more than once?
> 
> I think that point is that you have a sibling that there is
> for_each_cpu(sibling, cpu_smt_mask(iter_cpu) but you never use sibling
> in the loop except for clearing it on the cpumask cpus
> All the tests are done with iter_cpu so you will test several time
> iter_cpus but never the other sibling
> Should you use sibling instead ?
> 

oh got it. it was unintentional here, my bad.
good find

I did s/iter_cpu/sibling/ at required places:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d4a1b6474338..a75c2b382771 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6001,14 +6001,14 @@ static int select_non_idle_core(struct task_struct
*p, int prev_cpu, int target)

                for_each_cpu(sibling, cpu_smt_mask(iter_cpu)) {
                        __cpumask_clear_cpu(sibling, cpus);
-                       if (idle_cpu(iter_cpu)) {
+                       if (idle_cpu(sibling)) {
                                idle_cpu_count++;
-                               best_cpu = iter_cpu;
+                               best_cpu = sibling;
                        } else {
                                non_idle_cpu_count++;
-                               if (cpu_overutilized(iter_cpu))
+                               if (cpu_overutilized(sibling))
                                        overutil_cpu_count++;
-                               if (is_cpu_busy(cpu_util(iter_cpu)))
+                               if (is_cpu_busy(cpu_util(sibling)))
                                        busy_cpu_count++;
                        }
                }

and the took the results again to see functionality changes.
Results are still within the bounds with maximum of 15% gain in performance
and <2% of regression.

		Frequency benefit of TurboSched w.r.t. CFS
   +-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+
20 +-+ + + +  + + + + + + + +  + + + + + + + +  + + + + + + + +  + + + +-+
   |                                      Frequency benefit in %         |
   |                    **                                               |
15 +-+        * * **    ******                                         +-+
   |          * * ************                                           |
   |       ** * * ************ *                                         |
10 +-+     ** * * ************ *                                       +-+
   |       ** * * ************ * * ****                                  |
   |     **** * * ************ * * ****                                  |
 5 +-+ ****** * * ************ * * ******                              +-+
   |   ****** * * ************ * * **********                            |
 0 +-******** * * ************ * * ************ * * * ********** * * * **+
   |                                                                     |
   |                                                                     |
-5 +-+                                                                 +-+
   | + + + +  + + + + + + + +  + + + + + + + +  + + + + + + + +  + + + + |
   +-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+
     0 1 2 3  4 5 6 7 8 91011 1213141516171819 2021222324252627 28293031
                           No. of workload threads

		 Performance benefit of TurboSched w.r.t. CFS
20 +-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+
   | + + + +  + + + + + + + +  + + + + + + + +  + + + + + + + +  + + + + |
   |                                    Performance benefit in %         |
15 +-+                    **                                           +-+
   |                      **                                             |
   |                  ******** *                                         |
10 +-+                ******** *   **                                  +-+
   |                  ******** * * **                                    |
   |                  ******** * * **                                    |
 5 +-+              ********** * * ******                              +-+
   |                ********** * * **********                            |
   |              ************ * * **********   *       **               |
 0 +-******** * * ************ * * ************ * * * ********** * * * **+
   | ******** *                                                          |
   |                                                                     |
-5 +-+                                                                 +-+
   | + + + +  + + + + + + + +  + + + + + + + +  + + + + + + + +  + + + + |
   +-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+-+-+-+--+-+-+-+-+
     0 1 2 3  4 5 6 7 8 91011 1213141516171819 2021222324252627 28293031
                           No. of workload threads

Thanks,
Parth

> 
>>
>>>> +                            idle_cpu_count++;
>>>> +                            best_cpu = iter_cpu;
>>>> +                    } else {
>>>> +                            non_idle_cpu_count++;
>>>> +                            if (cpu_overutilized(iter_cpu))
>>>> +                                    overutil_cpu_count++;
>>>> +                            if (is_cpu_busy(cpu_util(iter_cpu)))
>>>> +                                    busy_cpu_count++;
>>>> +                    }
>>>> +            }
>>>> +
>>>
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ