Message-ID: <20130907123819.GA705@localhost>
Date:	Sat, 7 Sep 2013 20:38:19 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Yuanhan Liu <yuanhan.liu@...ux.intel.com>,
	"Huang, Ying" <ying.huang@...el.com>,
	Fengguang Wu <fengguang.wu@...el.com>,
	Mike Galbraith <efault@....de>, Ingo Molnar <mingo@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: 3-5% increased netperf throughput by "sched: Micro-optimize the
 smart wake-affine logic"

Hi Peter,

We are glad to report a measurable performance improvement (3-5% higher
netperf throughput) from your commit:

commit 7d9ffa8961482232d964173cccba6e14d2d543b2
Author: Peter Zijlstra <peterz@...radead.org>
Date:   Thu Jul 4 12:56:46 2013 +0800

    sched: Micro-optimize the smart wake-affine logic
    
    Smart wake-affine is using node-size as the factor currently, but the overhead
    of the mask operation is high.
    
    Thus, this patch introduce the 'sd_llc_size' percpu variable, which will record
    the highest cache-share domain size, and make it to be the new factor, in order
    to reduce the overhead and make it more reasonable.
    
    Tested-by: Davidlohr Bueso <davidlohr.bueso@...com>
    Tested-by: Michael Wang <wangyun@...ux.vnet.ibm.com>
    Signed-off-by: Peter Zijlstra <peterz@...radead.org>
    Acked-by: Michael Wang <wangyun@...ux.vnet.ibm.com>
    Cc: Mike Galbraith <efault@....de>
    Link: http://lkml.kernel.org/r/51D5008E.6030102@linux.vnet.ibm.com
    [ Tidied up the changelog. ]
    Signed-off-by: Ingo Molnar <mingo@...nel.org>

:040000 040000 e7c8a8c55bfa1261f3c6b75674a83eb76bb88a3f 129777b8d0b74ce189760ad76d9aaecd65b7ee7f M	kernel
bisect run success

# bad: [37570e7ef5be99ba5188bb17ed547ac4bbf65e73] Merge remote-tracking branch 'nfc-next/master' into devel-hourly-2013090406
# good: [6e4664525b1db28f8c4e1130957f70a94c19213e] Linux 3.11
git bisect start '37570e7ef5be99ba5188bb17ed547ac4bbf65e73' '6e4664525b1db28f8c4e1130957f70a94c19213e' '--'
# good: [8bcaa20433634ac70c96d9e5f8ece4b8577c9694] Merge remote-tracking branch 'arm-soc/for-next' into devel-hourly-2013090406
git bisect good 8bcaa20433634ac70c96d9e5f8ece4b8577c9694
# good: [820acdf740b7d04476959189e9a144c2315339a4] drm/i915: do display power state notification on crtc enable/disable
git bisect good 820acdf740b7d04476959189e9a144c2315339a4
# bad: [5bae522a51aa6bbae54bd2d745d0320f74c40b76] Merge remote-tracking branch 'perf/perf/trace.fmt' into devel-hourly-2013090406
git bisect bad 5bae522a51aa6bbae54bd2d745d0320f74c40b76
# bad: [8afb4c018e21c882c8fad196772ef74d494185e2] perf tools: Re-implement debug print function for linking python/perf.so
git bisect bad 8afb4c018e21c882c8fad196772ef74d494185e2
# good: [17f41571bb2c4a398785452ac2718a6c5d77180e] kprobes/x86: Call out into INT3 handler directly instead of using notifier
git bisect good 17f41571bb2c4a398785452ac2718a6c5d77180e
# bad: [34f77abcb34e1da4ee3ca5c5a41b673664eee1fa] perf annotate: Put dso name in symbol annotation title
git bisect bad 34f77abcb34e1da4ee3ca5c5a41b673664eee1fa
# bad: [8404db63461af62025f32f8368861fb33604e62f] perf tests: Add attr record group sampling test
git bisect bad 8404db63461af62025f32f8368861fb33604e62f
# bad: [9a545de019b536771feefb76f85e5038b65c2190] perf: Migrate per cpu event accounting
git bisect bad 9a545de019b536771feefb76f85e5038b65c2190
# good: [62470419e993f8d9d93db0effd3af4296ecb79a5] sched: Implement smarter wake-affine logic
git bisect good 62470419e993f8d9d93db0effd3af4296ecb79a5
# bad: [90983b16078ab0fdc58f0dab3e8e3da79c9579a2] perf: Sanitize get_callchain_buffer()
git bisect bad 90983b16078ab0fdc58f0dab3e8e3da79c9579a2
# bad: [6050cb0b0b366092d1383bc23d7b16cd26db00f0] perf: Fix branch stack refcount leak on callchain init failure
git bisect bad 6050cb0b0b366092d1383bc23d7b16cd26db00f0
# bad: [7d9ffa8961482232d964173cccba6e14d2d543b2] sched: Micro-optimize the smart wake-affine logic
git bisect bad 7d9ffa8961482232d964173cccba6e14d2d543b2
# first bad commit: [7d9ffa8961482232d964173cccba6e14d2d543b2] sched: Micro-optimize the smart wake-affine logic
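
The log above uses the annotated format that `git bisect log` writes, so it can be summarized mechanically. The following is only a minimal sketch under that assumption; the function name `summarize_bisect_log` and the `SAMPLE` string are hypothetical and not part of the original report or of any tool used to produce it.

```python
import re

# Matches the annotation lines found in the log above, e.g.
# "# good: [<sha>] <subject>" or "# first bad commit: [<sha>] <subject>".
_LINE = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)$")


def summarize_bisect_log(text):
    """Return (first_bad, good, bad) parsed from a git-bisect log.

    first_bad is a (sha, subject) tuple or None; good and bad are lists
    of (sha, subject) tuples in the order they appear in the log.
    """
    first_bad, good, bad = None, [], []
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if not m:
            continue  # skip the "git bisect ..." command lines and blanks
        verdict, sha, subject = m.groups()
        if verdict == "first bad commit":
            first_bad = (sha, subject)
        elif verdict == "good":
            good.append((sha, subject))
        else:
            bad.append((sha, subject))
    return first_bad, good, bad


# A few lines copied from the log above, used only as sample input.
SAMPLE = """\
# good: [62470419e993f8d9d93db0effd3af4296ecb79a5] sched: Implement smarter wake-affine logic
git bisect good 62470419e993f8d9d93db0effd3af4296ecb79a5
# bad: [7d9ffa8961482232d964173cccba6e14d2d543b2] sched: Micro-optimize the smart wake-affine logic
git bisect bad 7d9ffa8961482232d964173cccba6e14d2d543b2
# first bad commit: [7d9ffa8961482232d964173cccba6e14d2d543b2] sched: Micro-optimize the smart wake-affine logic
"""

if __name__ == "__main__":
    first_bad, good, bad = summarize_bisect_log(SAMPLE)
    print("first bad:", first_bad[0][:12], first_bad[1])
    print("good steps:", len(good), "bad steps:", len(bad))
```

Feeding the full log to the same function would list every commit marked good or bad during the run, which is the set of commits compared in the plots.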

A comparison of all good commits [*] with all bad commits [O]
(good/bad only in the git bisect sense: the "bad" commits are the ones
showing the changed behavior, here the higher netperf throughput)

                              netperf.Throughput_Mbps

   208 ++-------------------------------------------------------------------+
   206 +OOO O OOO     OOOO       O      O     O O O       O   O             |
       O     O   O   O       OO O   O O                 O          O   O O  O
   204 ++                  O   O   O O O  OOOO O   OOOOO   OOO O OO OOO   OO|
   202 ++          O        O                                               |
       |            O                                                       |
   200 ++                                                                   |
   198 ++                                                                   |
   196 ++                                                                   |
       |                                      *                             |
   194 ++                     ****. ***  .**** **.*******.* ***             |
   192 ++                    *     *   **                  ::               |
       | *   ***          .* :                             *                |
   190 ** *.*   **.** ****  *                                               |
   188 ++------------*------------------------------------------------------+


                                  vmstat.system.in

   1640 ++----------O-------------------------------------------------------+
        O O    OO  O   O         O                O                         |
   1620 +O    O   O   O O      O   O                O      O     O     O   OO
        |    O           OOO        OO   OO      O      O O O O    OO O     |
   1600 ++ O     O            O O       O      O     OO      O    O  O   O  |
        |                   O     O    O   O O           O      O         O |
   1580 ++                                  O   O  O                        |
        |                                                                   |
   1560 ++                                                   *              |
        |                       *          *     *        *  :*             |
   1540 ++ *                    :: ***.* * :**.* ::* * .** :*               |
        |  :+ * *     *       ** **     ::*     * * * *    *                |
   1520 ++*  * ::*** +:   ** :          *                                   |
        * :    *    *  :**  ::                                              |
   1500 +*-------------*----*-----------------------------------------------+


                                  vmstat.system.cs

   10000 ++-----------------------------------------------------------------+
         *****.*****                                                        |
    9800 ++         ** .******                                              |
    9600 ++           *       :*    **  **.*  * *** .* * ****               |
         |                    * *.**  **    ** *   *  * *    *.*            |
    9400 ++                                                                 |
    9200 ++                                                                 |
         |                                                                  |
    9000 ++                                                                 |
    8800 ++  O                                                              |
         O  O   OO O                    O  OOO  O    O OOO  OO  OO  OO    O |
    8600 ++O   O  O  O  OO   OO O   O O  O    OO   O  O   OO      O   O OO OO
    8400 +O         O     OOO  O  OO O O         OO            O   O        |
         |            O                                                     |
    8200 ++-----------------------------------------------------------------+


                          lock_stat.slock-AF_INET.contentions

   110000 ++----------------------------------------------------------------+
          |                                                                 |
   105000 ++      O     O         O                O                        |
          OOO   O   O O  O OO   O   O O           O  O    O O               O
          |   O  O   O O     OOO O O   OOOOO   O      OOO    OOO O OOO OOO O|
   100000 ++ O     O                        OOO     O    O      O O       O |
          |                                      O                          |
    95000 ++                                                                |
          |                                                                 |
    90000 ++                     * * .*** * ** *. * ***      ***            |
          |            *       ** * *    * *  *  * *   ****.*               |
          |  **.* **** ::*. ** :                                            |
    85000 ***    *    * *  *  *                                             |
          |                                                                 |
    80000 ++----------------------------------------------------------------+


                lock_stat.slock-AF_INET.contentions.lock_sock_nested

   92000 ++-----------------------------------------------------------------+
   90000 ++             O         O               O                         |
         | O     O         O   O    O                O                      |
   88000 OO    O   O O   OO O   O    O  OO       O    OO  OO O O        O  OO
   86000 ++     O   O O      OO    O  OO   OO OO        O   O    OOOOOO  O  |
   84000 ++ OO    O                          O     O     O      O         O |
   82000 ++                                     O                           |
         |                                                                  |
   80000 ++                                                                 |
   78000 ++                           *  *. *  * *           *.             |
   76000 ++           *       ***.** * * : * * :* **.** *** *  *            |
   74000 ++ **.* *    :+   ** :     *   *     *        *   *                |
         | *    * ***:  ***  *                                              |
   72000 **          *                                                      |
   70000 ++-----------------------------------------------------------------+


                    lock_stat.slock-AF_INET.contentions.tcp_v4_rcv

   110000 ++----------------------------------------------------------------+
          |                                                                 |
   105000 ++      O     O         O                O                        |
          OOO   O   O O  O OO   O   O O           O  O    O O               O
          |   O  O   O O     OOO O O   OOOOO   O      OOO    OOO O OOO OOO O|
   100000 ++ O     O                        OOO     O    O      O O       O |
          |                                      O                          |
    95000 ++                                                                |
          |                                                                 |
    90000 ++                     * * .*** * ** *. * ***      ***            |
          |            *       ** * *    * *  *  * *   ****.*               |
          |  **.* **** ::*. ** :                                            |
    85000 ***    *    * *  *  *                                             |
          |                                                                 |
    80000 ++----------------------------------------------------------------+
