Message-ID: <20151123175643.GA10703@e104805>
Date:	Mon, 23 Nov 2015 17:56:44 +0000
From:	Javi Merino <javi.merino@....com>
To:	Jacob Pan <jacob.jun.pan@...ux.intel.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	John Stultz <john.stultz@...aro.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
	Len Brown <len.brown@...el.com>,
	Rafael Wysocki <rafael.j.wysocki@...el.com>,
	Eduardo Valentin <edubezval@...il.com>,
	Paul Turner <pjt@...gle.com>,
	Morten Rasmussen <Morten.Rasmussen@....com>,
	Juri Lelli <Juri.Lelli@....com>
Subject: Re: [PATCH 3/4] sched: introduce synchronized idle injection

On Fri, Nov 13, 2015 at 11:53:06AM -0800, Jacob Pan wrote:
> With increasingly constrained power and thermal budget, it's often
> necessary to cap power via throttling. Throttling individual CPUs
> or devices at random times can help power capping but may not be
> optimal in terms of energy efficiency. Frequency scaling is also
> limited by certain range before losing energy efficiency.
> 
> In general, the optimal solution in terms of energy efficiency is
> to align idle periods such that more shared circuits can be power
> gated to enter lower power states. Combined with energy efficient
> frequency point, idle injection provides a way to scale power and
> performance efficiently.
> 
> This patch introduces a scheduler based idle injection method, it
> works by blocking CFS runqueue synchronously and periodically. The
> actions on all online CPUs are orchestrated by per CPU hrtimers.
> 
> Two sysctl knobs are given to the userspace for selecting the
> percentage of idle time as well as the forced idle duration for each
> idle period injected.
> 
> Since only CFS class is targeted, other high priority tasks are not
> affected, such as EDF and RT tasks as well as softirq and interrupts.
> 
> Hotpath in CFS pick_next_task is optimized by Peter Zijlstra, where
> a new runnable flag is introduced to combine forced idle and
> nr_running.
> 
> Signed-off-by: Jacob Pan <jacob.jun.pan@...ux.intel.com>
> ---
>  include/linux/sched.h        |  11 ++
>  include/linux/sched/sysctl.h |   5 +
>  init/Kconfig                 |  10 ++
>  kernel/sched/fair.c          | 353 ++++++++++++++++++++++++++++++++++++++++++-
>  kernel/sched/sched.h         |  54 ++++++-
>  kernel/sysctl.c              |  21 +++
>  6 files changed, 449 insertions(+), 5 deletions(-)

I've tested this series on Juno (2xCortex-A57 4xCortex-A53).  With 50%
idle injection and 6 busy loops running, the scheduler sometimes keeps
two of them on the same CPU while another CPU is completely idle.
Without idle injection the scheduler does the sensible thing: it puts
one busy loop on each CPU.  I'm running systemd and this only happens
with CONFIG_SCHED_AUTOGROUP=y.  If I unset CONFIG_SCHED_AUTOGROUP, the
tasks are spread across all CPUs as usual.
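
Since the problem only shows up with CONFIG_SCHED_AUTOGROUP=y, it may
help to check which autogroup each busy loop belongs to.  The sketch
below is an illustration only, not part of the patch or of the test
setup described here.  It assumes the /proc/<pid>/autogroup line has
the form documented in sched(7), "/autogroup-<id> nice <n>"; the helper
names parse_autogroup() and autogroup_of() are made up for this
example.

```python
import re
from pathlib import Path

# Assumed format of /proc/<pid>/autogroup, as documented in sched(7):
#   /autogroup-12 nice 0
AUTOGROUP_RE = re.compile(r"^/autogroup-(\d+)\s+nice\s+(-?\d+)$")


def parse_autogroup(text):
    """Parse one autogroup line and return (autogroup id, nice value)."""
    m = AUTOGROUP_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unexpected autogroup line: {text!r}")
    return int(m.group(1)), int(m.group(2))


def autogroup_of(pid):
    """Return (autogroup id, nice) for pid, or None if the proc file is
    absent (for example when CONFIG_SCHED_AUTOGROUP is not set)."""
    path = Path(f"/proc/{pid}/autogroup")
    try:
        return parse_autogroup(path.read_text())
    except FileNotFoundError:
        return None
```

Calling autogroup_of() on each busy_loop pid would show whether the
six tasks share one autogroup or are spread over several.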

See below for part of the trace that shows this problem.  CPU3 has two
100% tasks, 1554 and 1549, but the scheduler never moves one of them
to CPU4, which has an empty runqueue.  Both CPUs are in the same
domain.  Juri helped me add two additional tracepoints to track the
load of a task and of a CPU.  These tracepoints are added at the end of
update_load_avg().

          <idle>-0     [002]   164.739796: sched_cfs_idle_inject_timer: throttled=0
          <idle>-0     [000]   164.739797: sched_cfs_idle_inject_timer: throttled=0
          <idle>-0     [005]   164.739797: sched_cfs_idle_inject_timer: throttled=0
          <idle>-0     [001]   164.739797: sched_cfs_idle_inject_timer: throttled=0
          <idle>-0     [003]   164.739797: sched_cfs_idle_inject_timer: throttled=0
          <idle>-0     [004]   164.739798: sched_cfs_idle_inject_timer: throttled=0
          <idle>-0     [002]   164.739802: sched_load_avg_cpu:   cpu=2 load_avg=171 util_avg=406
          <idle>-0     [002]   164.739803: sched_load_avg_task:  comm=busy_loop pid=1552 cpu=2 load_avg=1006 util_avg=400 load_sum=48043453 util_sum=19130537 period_contrib=173
          <idle>-0     [001]   164.739803: sched_load_avg_cpu:   cpu=1 load_avg=170 util_avg=405
          <idle>-0     [002]   164.739804: sched_load_avg_cpu:   cpu=2 load_avg=1014 util_avg=403
          <idle>-0     [001]   164.739804: sched_load_avg_task:  comm=busy_loop pid=1551 cpu=1 load_avg=1008 util_avg=401 load_sum=48161276 util_sum=19177731 period_contrib=288
          <idle>-0     [005]   164.739804: sched_load_avg_cpu:   cpu=5 load_avg=169 util_avg=404
          <idle>-0     [002]   164.739805: sched_switch:         swapper/2:0 [120] R ==> busy_loop:1552 [120]
          <idle>-0     [001]   164.739805: sched_load_avg_cpu:   cpu=1 load_avg=1024 util_avg=407
          <idle>-0     [003]   164.739805: sched_load_avg_cpu:   cpu=3 load_avg=340 util_avg=405
          <idle>-0     [000]   164.739805: sched_load_avg_cpu:   cpu=0 load_avg=168 util_avg=400
          <idle>-0     [001]   164.739806: sched_switch:         swapper/1:0 [120] R ==> busy_loop:1551 [120]
          <idle>-0     [005]   164.739806: sched_load_avg_task:  comm=busy_loop pid=1550 cpu=5 load_avg=1010 util_avg=402 load_sum=48229881 util_sum=19205027 period_contrib=355
          <idle>-0     [003]   164.739807: sched_load_avg_task:  comm=busy_loop pid=1549 cpu=3 load_avg=1012 util_avg=193 load_sum=48316673 util_sum=9247244 period_contrib=441
          <idle>-0     [000]   164.739807: sched_load_avg_task:  comm=busy_loop pid=1553 cpu=0 load_avg=1005 util_avg=400 load_sum=48003551 util_sum=19119112 period_contrib=134
          <idle>-0     [005]   164.739808: sched_load_avg_cpu:   cpu=5 load_avg=1002 util_avg=399
          <idle>-0     [003]   164.739808: sched_load_avg_cpu:   cpu=3 load_avg=2045 util_avg=407
          <idle>-0     [000]   164.739809: sched_load_avg_cpu:   cpu=0 load_avg=1008 util_avg=401
          <idle>-0     [005]   164.739810: sched_switch:         swapper/5:0 [120] R ==> busy_loop:1550 [120]
          <idle>-0     [003]   164.739810: sched_switch:         swapper/3:0 [120] R ==> busy_loop:1549 [120]
          <idle>-0     [000]   164.739811: sched_switch:         swapper/0:0 [120] R ==> busy_loop:1553 [120]
       busy_loop-1552  [002]   164.743793: sched_stat_runtime:   comm=busy_loop pid=1552 runtime=3991560 [ns] vruntime=605432548 [ns]
       busy_loop-1549  [003]   164.743794: sched_stat_runtime:   comm=busy_loop pid=1549 runtime=3990040 [ns] vruntime=382380848 [ns]
       busy_loop-1552  [002]   164.743794: sched_load_avg_task:  comm=busy_loop pid=1552 cpu=2 load_avg=1024 util_avg=456 load_sum=48889883 util_sum=21796057 period_contrib=999
       busy_loop-1553  [000]   164.743794: sched_stat_runtime:   comm=busy_loop pid=1553 runtime=3990180 [ns] vruntime=590391894 [ns]
       busy_loop-1551  [001]   164.743794: sched_stat_runtime:   comm=busy_loop pid=1551 runtime=3992100 [ns] vruntime=272056341 [ns]
       busy_loop-1550  [005]   164.743794: sched_stat_runtime:   comm=busy_loop pid=1550 runtime=3990920 [ns] vruntime=198320034 [ns]
       busy_loop-1552  [002]   164.743795: sched_load_avg_cpu:   cpu=2 load_avg=1010 util_avg=450
       busy_loop-1551  [001]   164.743796: sched_load_avg_task:  comm=busy_loop pid=1551 cpu=1 load_avg=1004 util_avg=447 load_sum=47958941 util_sum=21380913 period_contrib=90
       busy_loop-1549  [003]   164.743796: sched_load_avg_task:  comm=busy_loop pid=1549 cpu=3 load_avg=1007 util_avg=257 load_sum=48112396 util_sum=12285572 period_contrib=241
       busy_loop-1552  [002]   164.743796: sched_load_avg_cpu:   cpu=2 load_avg=170 util_avg=453
       busy_loop-1553  [000]   164.743796: sched_load_avg_task:  comm=busy_loop pid=1553 cpu=0 load_avg=1023 util_avg=456 load_sum=48847931 util_sum=21780791 period_contrib=958
       busy_loop-1551  [001]   164.743796: sched_load_avg_cpu:   cpu=1 load_avg=1020 util_avg=454
       busy_loop-1550  [005]   164.743797: sched_load_avg_task:  comm=busy_loop pid=1550 cpu=5 load_avg=1005 util_avg=448 load_sum=48026522 util_sum=21410614 period_contrib=156
       busy_loop-1549  [003]   164.743797: sched_load_avg_cpu:   cpu=3 load_avg=2036 util_avg=454
       busy_loop-1553  [000]   164.743798: sched_load_avg_cpu:   cpu=0 load_avg=1004 util_avg=447
       busy_loop-1551  [001]   164.743798: sched_load_avg_cpu:   cpu=1 load_avg=169 util_avg=452
       busy_loop-1550  [005]   164.743798: sched_load_avg_cpu:   cpu=5 load_avg=1020 util_avg=455
       busy_loop-1553  [000]   164.743800: sched_load_avg_cpu:   cpu=0 load_avg=171 util_avg=456
       busy_loop-1549  [003]   164.743800: sched_load_avg_cpu:   cpu=3 load_avg=339 util_avg=452
       busy_loop-1550  [005]   164.743800: sched_load_avg_cpu:   cpu=5 load_avg=168 util_avg=450
       busy_loop-1552  [002]   164.747792: sched_stat_runtime:   comm=busy_loop pid=1552 runtime=3999320 [ns] vruntime=609431868 [ns]
       busy_loop-1553  [000]   164.747793: sched_stat_runtime:   comm=busy_loop pid=1553 runtime=3999380 [ns] vruntime=594391274 [ns]
       busy_loop-1549  [003]   164.747793: sched_stat_runtime:   comm=busy_loop pid=1549 runtime=3999540 [ns] vruntime=386380388 [ns]
       busy_loop-1552  [002]   164.747794: sched_load_avg_task:  comm=busy_loop pid=1552 cpu=2 load_avg=1019 util_avg=499 load_sum=48694671 util_sum=23849523 period_contrib=808
       busy_loop-1551  [001]   164.747794: sched_stat_runtime:   comm=busy_loop pid=1551 runtime=3999880 [ns] vruntime=276056221 [ns]
       busy_loop-1550  [005]   164.747795: sched_stat_runtime:   comm=busy_loop pid=1550 runtime=3999280 [ns] vruntime=202319314 [ns]
       busy_loop-1552  [002]   164.747795: sched_load_avg_cpu:   cpu=2 load_avg=1006 util_avg=492
       busy_loop-1551  [001]   164.747795: sched_load_avg_task:  comm=busy_loop pid=1551 cpu=1 load_avg=1022 util_avg=500 load_sum=48813533 util_sum=23907693 period_contrib=924
       busy_loop-1553  [000]   164.747795: sched_load_avg_task:  comm=busy_loop pid=1553 cpu=0 load_avg=1019 util_avg=499 load_sum=48652717 util_sum=23832040 period_contrib=767
       busy_loop-1549  [003]   164.747796: sched_load_avg_task:  comm=busy_loop pid=1549 cpu=3 load_avg=1003 util_avg=315 load_sum=47917292 util_sum=15063949 period_contrib=50
       busy_loop-1551  [001]   164.747796: sched_load_avg_cpu:   cpu=1 load_avg=1016 util_avg=497
       busy_loop-1552  [002]   164.747796: sched_load_avg_cpu:   cpu=2 load_avg=169 util_avg=496
       busy_loop-1550  [005]   164.747797: sched_load_avg_task:  comm=busy_loop pid=1550 cpu=5 load_avg=1023 util_avg=501 load_sum=48880090 util_sum=23938753 period_contrib=989
       busy_loop-1553  [000]   164.747797: sched_load_avg_cpu:   cpu=0 load_avg=1022 util_avg=500
       busy_loop-1549  [003]   164.747797: sched_load_avg_cpu:   cpu=3 load_avg=2028 util_avg=496
       busy_loop-1551  [001]   164.747797: sched_load_avg_cpu:   cpu=1 load_avg=169 util_avg=495
       busy_loop-1550  [005]   164.747798: sched_load_avg_cpu:   cpu=5 load_avg=1016 util_avg=497
       busy_loop-1553  [000]   164.747799: sched_load_avg_cpu:   cpu=0 load_avg=170 util_avg=499
       busy_loop-1549  [003]   164.747800: sched_load_avg_cpu:   cpu=3 load_avg=337 util_avg=494
       busy_loop-1550  [005]   164.747800: sched_load_avg_cpu:   cpu=5 load_avg=168 util_avg=492
       busy_loop-1552  [002]   164.751792: sched_stat_runtime:   comm=busy_loop pid=1552 runtime=4000260 [ns] vruntime=613432128 [ns]
       busy_loop-1549  [003]   164.751793: sched_stat_runtime:   comm=busy_loop pid=1549 runtime=3999760 [ns] vruntime=390380148 [ns]
       busy_loop-1553  [000]   164.751793: sched_stat_runtime:   comm=busy_loop pid=1553 runtime=3999920 [ns] vruntime=598391194 [ns]
       busy_loop-1552  [002]   164.751793: sched_load_avg_task:  comm=busy_loop pid=1552 cpu=2 load_avg=1015 util_avg=538 load_sum=48500452 util_sum=25717351 period_contrib=618
       busy_loop-1550  [005]   164.751793: sched_stat_runtime:   comm=busy_loop pid=1550 runtime=3999920 [ns] vruntime=206319234 [ns]
       busy_loop-1552  [002]   164.751794: sched_load_avg_cpu:   cpu=2 load_avg=1024 util_avg=542
       busy_loop-1551  [001]   164.751794: sched_stat_runtime:   comm=busy_loop pid=1551 runtime=4000120 [ns] vruntime=280056341 [ns]
       busy_loop-1549  [003]   164.751795: sched_load_avg_task:  comm=busy_loop pid=1549 cpu=3 load_avg=1021 util_avg=376 load_sum=48771927 util_sum=17985591 period_contrib=884
       busy_loop-1553  [000]   164.751795: sched_load_avg_task:  comm=busy_loop pid=1553 cpu=0 load_avg=1015 util_avg=538 load_sum=48458496 util_sum=25697835 period_contrib=577
       busy_loop-1551  [001]   164.751795: sched_load_avg_task:  comm=busy_loop pid=1551 cpu=1 load_avg=1018 util_avg=539 load_sum=48619308 util_sum=25780552 period_contrib=734
       busy_loop-1550  [005]   164.751795: sched_load_avg_task:  comm=busy_loop pid=1550 cpu=5 load_avg=1019 util_avg=540 load_sum=48685865 util_sum=25814558 period_contrib=799
       busy_loop-1552  [002]   164.751796: sched_load_avg_cpu:   cpu=2 load_avg=169 util_avg=535
       busy_loop-1551  [001]   164.751796: sched_load_avg_cpu:   cpu=1 load_avg=1011 util_avg=536
       busy_loop-1553  [000]   164.751797: sched_load_avg_cpu:   cpu=0 load_avg=1018 util_avg=539
       busy_loop-1549  [003]   164.751797: sched_load_avg_cpu:   cpu=3 load_avg=2020 util_avg=535
       busy_loop-1550  [005]   164.751797: sched_load_avg_cpu:   cpu=5 load_avg=1012 util_avg=536
       busy_loop-1551  [001]   164.751797: sched_load_avg_cpu:   cpu=1 load_avg=168 util_avg=533
       busy_loop-1553  [000]   164.751799: sched_load_avg_cpu:   cpu=0 load_avg=169 util_avg=538
       busy_loop-1549  [003]   164.751799: sched_load_avg_cpu:   cpu=3 load_avg=336 util_avg=533
       busy_loop-1550  [005]   164.751800: sched_load_avg_cpu:   cpu=5 load_avg=171 util_avg=543
       busy_loop-1549  [003]   164.751807: sched_stat_runtime:   comm=busy_loop pid=1549 runtime=13700 [ns] vruntime=390393848 [ns]
       busy_loop-1549  [003]   164.751809: sched_load_avg_task:  comm=busy_loop pid=1549 cpu=3 load_avg=1021 util_avg=376 load_sum=48785239 util_sum=17998903 period_contrib=897
       busy_loop-1549  [003]   164.751811: sched_load_avg_cpu:   cpu=3 load_avg=2020 util_avg=535
       busy_loop-1549  [003]   164.751812: sched_load_avg_task:  comm=busy_loop pid=1554 cpu=3 load_avg=1015 util_avg=163 load_sum=48472554 util_sum=7827475 period_contrib=593
       busy_loop-1549  [003]   164.751814: sched_load_avg_cpu:   cpu=3 load_avg=2020 util_avg=535
       busy_loop-1549  [003]   164.751816: sched_switch:         busy_loop:1549 [120] R ==> busy_loop:1554 [120]
       busy_loop-1552  [002]   164.755792: sched_stat_runtime:   comm=busy_loop pid=1552 runtime=3999800 [ns] vruntime=617431928 [ns]
       busy_loop-1553  [000]   164.755793: sched_stat_runtime:   comm=busy_loop pid=1553 runtime=3999880 [ns] vruntime=602391074 [ns]
       busy_loop-1552  [002]   164.755793: sched_load_avg_task:  comm=busy_loop pid=1552 cpu=2 load_avg=1011 util_avg=574 load_sum=48306205 util_sum=27414009 period_contrib=428
       busy_loop-1550  [005]   164.755793: sched_stat_runtime:   comm=busy_loop pid=1550 runtime=3999780 [ns] vruntime=210319014 [ns]
       busy_loop-1554  [003]   164.755793: sched_stat_runtime:   comm=busy_loop pid=1554 runtime=3986540 [ns] vruntime=382907621 [ns]
       busy_loop-1552  [002]   164.755794: sched_load_avg_cpu:   cpu=2 load_avg=1019 util_avg=578
       busy_loop-1551  [001]   164.755794: sched_stat_runtime:   comm=busy_loop pid=1551 runtime=3999860 [ns] vruntime=284056201 [ns]
       busy_loop-1553  [000]   164.755795: sched_load_avg_task:  comm=busy_loop pid=1553 cpu=0 load_avg=1010 util_avg=573 load_sum=48264247 util_sum=27392629 period_contrib=387
       busy_loop-1551  [001]   164.755795: sched_load_avg_task:  comm=busy_loop pid=1551 cpu=1 load_avg=1014 util_avg=575 load_sum=48425055 util_sum=27481823 period_contrib=544
       busy_loop-1552  [002]   164.755795: sched_load_avg_cpu:   cpu=2 load_avg=168 util_avg=570
       busy_loop-1550  [005]   164.755795: sched_load_avg_task:  comm=busy_loop pid=1550 cpu=5 load_avg=1015 util_avg=576 load_sum=48491612 util_sum=27518531 period_contrib=609
       busy_loop-1554  [003]   164.755796: sched_load_avg_task:  comm=busy_loop pid=1554 cpu=3 load_avg=1010 util_avg=230 load_sum=48265186 util_sum=10993484 period_contrib=390
       busy_loop-1551  [001]   164.755796: sched_load_avg_cpu:   cpu=1 load_avg=1007 util_avg=571
       busy_loop-1553  [000]   164.755796: sched_load_avg_cpu:   cpu=0 load_avg=1014 util_avg=575
       busy_loop-1550  [005]   164.755797: sched_load_avg_cpu:   cpu=5 load_avg=1008 util_avg=572
       busy_loop-1554  [003]   164.755797: sched_load_avg_cpu:   cpu=3 load_avg=2012 util_avg=571
       busy_loop-1551  [001]   164.755797: sched_load_avg_cpu:   cpu=1 load_avg=171 util_avg=581
       busy_loop-1553  [000]   164.755799: sched_load_avg_cpu:   cpu=0 load_avg=168 util_avg=574
       busy_loop-1550  [005]   164.755799: sched_load_avg_cpu:   cpu=5 load_avg=170 util_avg=579
       busy_loop-1554  [003]   164.755799: sched_load_avg_cpu:   cpu=3 load_avg=342 util_avg=581
       busy_loop-1552  [002]   164.759791: sched_cfs_idle_inject_timer: throttled=1
       busy_loop-1551  [001]   164.759791: sched_cfs_idle_inject_timer: throttled=1
       busy_loop-1550  [005]   164.759792: sched_cfs_idle_inject_timer: throttled=1
       busy_loop-1554  [003]   164.759792: sched_cfs_idle_inject_timer: throttled=1
       busy_loop-1553  [000]   164.759792: sched_cfs_idle_inject_timer: throttled=1
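
The placement shown in the trace above can be summarised mechanically.
The following is a minimal sketch, assuming the text format of the
trace lines above; tasks_per_cpu(), report() and SAMPLE are names
invented for this example, and only the sched_load_avg_task events are
used because they carry explicit comm=, pid= and cpu= fields.

```python
import re
from collections import defaultdict

# Prefix of a trace line: "<task>-<pid>  [<cpu>]  <timestamp>: <event>: <fields>"
LINE_RE = re.compile(
    r"^\s*(?P<task>.+)-(?P<tpid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<event>\w+):\s*(?P<rest>.*)$"
)


def tasks_per_cpu(lines, comm="busy_loop"):
    """Map each CPU number to the set of pids of `comm` reported on it."""
    placement = defaultdict(set)
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("event") != "sched_load_avg_task":
            continue
        # The remainder of the line is a list of key=value pairs.
        fields = dict(
            kv.split("=", 1) for kv in m.group("rest").split() if "=" in kv
        )
        if fields.get("comm") == comm:
            placement[int(fields["cpu"])].add(int(fields["pid"]))
    return dict(placement)


def report(lines, ncpus=6):
    """Print one line per CPU; ncpus=6 matches the Juno board used here."""
    placement = tasks_per_cpu(lines)
    for cpu in range(ncpus):
        pids = sorted(placement.get(cpu, ()))
        print(f"cpu{cpu}: {len(pids)} busy_loop task(s) {pids}")


# A few lines copied from the trace above, used as sample input.
SAMPLE = [
    "          <idle>-0     [002]   164.739803: sched_load_avg_task:  comm=busy_loop pid=1552 cpu=2 load_avg=1006 util_avg=400 load_sum=48043453 util_sum=19130537 period_contrib=173",
    "          <idle>-0     [002]   164.739805: sched_switch:         swapper/2:0 [120] R ==> busy_loop:1552 [120]",
    "          <idle>-0     [003]   164.739807: sched_load_avg_task:  comm=busy_loop pid=1549 cpu=3 load_avg=1012 util_avg=193 load_sum=48316673 util_sum=9247244 period_contrib=441",
    "       busy_loop-1549  [003]   164.751812: sched_load_avg_task:  comm=busy_loop pid=1554 cpu=3 load_avg=1015 util_avg=163 load_sum=48472554 util_sum=7827475 period_contrib=593",
]
```

Running report() over the whole excerpt should list pids 1549 and 1554
under cpu3 and no busy_loop task under cpu4, which is the imbalance
described above.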


Cheers,
Javi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
