Message-ID: <20190515111745.dj6jhf4lypppl3tf@vireshk-i7>
Date:   Wed, 15 May 2019 16:47:45 +0530
From:   Viresh Kumar <viresh.kumar@...aro.org>
To:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>, tkjos@...gle.com,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        quentin.perret@...aro.org, chris.redpath@....com,
        Dietmar.Eggemann@....com, linux-kernel@...r.kernel.org,
        liu.song.a23@...il.com, steven.sistare@...cle.com,
        subhra.mazumdar@...cle.com
Subject: Re: [RFC V2 0/2] sched/fair: Fallback to sched-idle CPU for better
 performance

On 25-04-19, 15:07, Viresh Kumar wrote:
> Hi,
> 
> Here is another attempt to get some benefit out of the sched-idle
> policy. The previous version [1] focused on getting better power
> numbers, while this version tries to get better performance (lower
> response time) for the tasks.
> 
> The first patch is unchanged from v1 and accumulates
> information about sched-idle tasks per CPU.
> 
> The second patch changes the way the target CPU is selected in the fast
> path. Currently, we look for an idle CPU in select_idle_sibling() to
> run the next task, but if no idle CPU is found it is better, for
> performance reasons, to pick the CPU which will run the task the
> soonest. A CPU which isn't idle but has only SCHED_IDLE activity queued
> on it should be a good target by this criterion, as any normal fair
> task will most likely preempt the currently running SCHED_IDLE task
> immediately. In fact, choosing such a CPU should give better results
> than an idle CPU, since it can start running the task right away, while
> an idle CPU first needs to be woken up from its idle state.
> 
> Basic testing has been done with rt-app for now, to make sure the tasks
> are getting placed correctly.
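
To make the fast-path change quoted above a bit more concrete, here is a
tiny self-contained illustration of the selection order. This is *not*
the actual patch: the cpu_state values and the pick_cpu() helper are
made up for the example and only mirror the idea applied to
select_idle_sibling().

/* Toy model: prefer an idle CPU, else a CPU with only SCHED_IDLE work. */
#include <stdio.h>

enum cpu_state { CPU_IDLE, CPU_SCHED_IDLE_ONLY, CPU_BUSY };

static int pick_cpu(const enum cpu_state *state, int nr_cpus, int target)
{
	int cpu, sched_idle_cpu = -1;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (state[cpu] == CPU_IDLE)
			return cpu;			/* best case: a fully idle CPU */
		if (sched_idle_cpu < 0 && state[cpu] == CPU_SCHED_IDLE_ONLY)
			sched_idle_cpu = cpu;		/* remember first SCHED_IDLE-only CPU */
	}

	/* No idle CPU found: a SCHED_IDLE-only CPU still runs the task soonest. */
	return sched_idle_cpu >= 0 ? sched_idle_cpu : target;
}

int main(void)
{
	/* Hypothetical octa-core snapshot: no CPU idle, CPU2 has only SCHED_IDLE work. */
	enum cpu_state cpus[8] = {
		CPU_BUSY, CPU_BUSY, CPU_SCHED_IDLE_ONLY, CPU_BUSY,
		CPU_BUSY, CPU_BUSY, CPU_BUSY, CPU_BUSY,
	};

	printf("picked CPU %d\n", pick_cpu(cpus, 8, 0));	/* prints "picked CPU 2" */
	return 0;
}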

More results here:

- Tested on the octa-core Hikey platform (all CPUs change frequency
  together).

- The rt-app json used is attached here. It creates a few tasks and we
  monitor their scheduling latency by looking at the "wu_lat" field
  (usec).

- The histograms are created using
  https://github.com/adkein/textogram: textogram -a 0 -z 1000 -n 10

- The stats are accumulated using https://github.com/nferraz/st (a rough
  sketch of the full post-processing pipeline follows this list).

- NOTE: The % values shown don't add up to 100%; just look at the raw
  counts in parentheses instead.
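
Roughly, the post-processing pipeline looks like the sketch below. The
logdir path and per-task log file naming are placeholders, and the field
layout of the logs is assumed; only the textogram flags are the ones
mentioned above.

# Pull the "wu_lat" column out of all rt-app per-task logs.
awk 'FNR == 1 { for (i = 1; i <= NF; i++) if ($i == "wu_lat") c = i; next }
     { print $c }' /tmp/rt-app-*.log > wu_lat.txt

# Histogram of wakeup latency (usec), 10 buckets from 0 to 1000.
textogram -a 0 -z 1000 -n 10 < wu_lat.txt

# N / min / max / sum / mean / stddev.
st wu_lat.txt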


Test 1: Create 8 CFS tasks (no SCHED_IDLE tasks) without this
patchset:

           0 - 100  : ##################################################   72% (3688)
         100 - 200  : ################                                     24% (1253)
         200 - 300  : ##                                                    2% (149)
         300 - 400  :                                                       0% (22)
         400 - 500  :                                                       0% (1)
         500 - 600  :                                                       0% (3)
         600 - 700  :                                                       0% (1)
         700 - 800  :                                                       0% (1)
         800 - 900  :
         900 - 1000 :                                                       0% (1)
              >1000 : 0% (17)

        N       min     max     sum     mean    stddev
        5136    0       2452    535985  104.358 104.585


Test 2: Create 8 CFS tasks and 5 SCHED_IDLE tasks:

        A. Without sched-idle patchset:

           0 - 100  : ##################################################   88% (3102)
         100 - 200  : ##                                                    4% (148)
         200 - 300  :                                                       1% (41)
         300 - 400  :                                                       0% (27)
         400 - 500  :                                                       0% (33)
         500 - 600  :                                                       0% (32)
         600 - 700  :                                                       1% (36)
         700 - 800  :                                                       0% (27)
         800 - 900  :                                                       0% (19)
         900 - 1000 :                                                       0% (26)
              >1000 : 34% (1218)

        N       min     max     sum             mean    stddev
        4710    0       67664   5.25956e+06     1116.68 2315.09


        B. With sched-idle patchset:

           0 - 100  : ##################################################   99% (5042)
         100 - 200  :                                                       0% (8)
         200 - 300  :
         300 - 400  :
         400 - 500  :                                                       0% (2)
         500 - 600  :                                                       0% (1)
         600 - 700  :
         700 - 800  :                                                       0% (1)
         800 - 900  :                                                       0% (1)
         900 - 1000 :
              >1000 : 0% (40)

        N       min     max     sum     mean    stddev
        5095    0       7773    523170  102.683 475.482


With this patchset, the mean latency dropped to about 10% of its earlier
value (1116.68 -> 102.68 usec) and the stddev to about 20% (2315.09 ->
475.48).

I have tried more combinations of CFS and SCHED_IDLE tasks and see the
expected improvement in scheduling latency for all of them.
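
For reference, outside of rt-app a SCHED_IDLE task can be created
directly with sched_setscheduler(). A minimal sketch (the endless busy
loop is just placeholder load, not anything the tests above run):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	/* SCHED_IDLE requires sched_priority == 0. */
	struct sched_param param = { .sched_priority = 0 };

	if (sched_setscheduler(0, SCHED_IDLE, &param)) {
		perror("sched_setscheduler");
		return 1;
	}

	/* Burn CPU in the idle class; any normal fair task preempts this. */
	for (;;)
		;
}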

-- 
viresh

Attachment: "sched-idle.json" (application/json, 841 bytes)
