Message-ID: <YW1sCxRUZBX8iL6w@zn.tnic>
Date:   Mon, 18 Oct 2021 14:43:55 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     JY Ni <jiayu.ni@...ux.alibaba.com>
Cc:     Luming Yu <luming.yu@...il.com>,
        wujinhua <wujinhua@...ux.alibaba.com>, x86 <x86@...nel.org>,
        "zelin.deng" <zelin.deng@...ux.alibaba.com>,
        ak <ak@...ux.intel.com>, "luming.yu" <luming.yu@...el.com>,
        "fan.du" <fan.du@...el.com>,
        "artie.ding" <artie.ding@...ux.alibaba.com>,
        "tony.luck" <tony.luck@...el.com>, tglx <tglx@...utronix.de>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        "pawan.kumar.gupta" <pawan.kumar.gupta@...ux.intel.com>,
        "fenghua.yu" <fenghua.yu@...el.com>, hpa <hpa@...or.com>,
        "ricardo.neri-calderon" <ricardo.neri-calderon@...ux.intel.com>,
        peterz <peterz@...radead.org>
Subject: Re: Re: [PATCH] perf: optimize clear page in Intel specified model with movq instruction

On Mon, Oct 18, 2021 at 03:43:46PM +0800, JY Ni wrote:
> Precondition: do tests on an Intel CPX server. CPU information of my
> test machine is in backup part.

My machine:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 106
stepping        : 4

That's an ICELAKE_X (family 6, model 106 = 0x6a).

I ran

./tools/perf/perf stat --repeat 5 --sync --pre=/root/bin/pre-build-kernel.sh -- make -s -j96 bzImage

on 5.15-rc6, with and without the patch, building allmodconfig in each
of the 10 runs (5 per kernel).

pre-build-kernel.sh is

---
#!/bin/bash

make -s clean
echo 3 > /proc/sys/vm/drop_caches
---

Results are below, but to me that's all "in the noise": a difference of
around one percent, if I can trust the stddev. That is not even close to 40%.

So basically you're wasting your time.

5.15-rc6
--------

# ./tools/perf/perf stat --repeat 5 --sync --pre=/root/bin/pre-build-kernel.sh -- make -s -j96 bzImage

 Performance counter stats for 'make -s -j96 bzImage' (5 runs):

      3,072,392.92 msec task-clock                #   51.109 CPUs utilized            ( +-  0.05% )
         1,351,534      context-switches          #  440.257 /sec                     ( +-  0.99% )
           224,862      cpu-migrations            #   73.248 /sec                     ( +-  1.39% )
        85,073,723      page-faults               #   27.712 K/sec                    ( +-  0.01% )
 8,743,357,421,495      cycles                    #    2.848 GHz                      ( +-  0.06% )
 7,643,946,991,468      instructions              #    0.88  insn per cycle           ( +-  0.00% )
 1,705,128,638,240      branches                  #  555.440 M/sec                    ( +-  0.00% )
    37,637,576,027      branch-misses             #    2.21% of all branches          ( +-  0.03% )
22,511,903,971,150      slots                     #    7.333 G/sec                    ( +-  0.03% )
 7,377,211,958,188      topdown-retiring          #     32.5% retiring                ( +-  0.02% )
 3,145,247,374,138      topdown-bad-spec          #     13.9% bad speculation         ( +-  0.27% )
 8,018,664,899,041      topdown-fe-bound          #     35.2% frontend bound          ( +-  0.07% )
 4,167,103,609,622      topdown-be-bound          #     18.3% backend bound           ( +-  0.09% )

            60.114 +- 0.112 seconds time elapsed  ( +-  0.19% )



5.15-rc6 + patch
----------------

 Performance counter stats for 'make -s -j96 bzImage' (5 runs):

      3,033,250.65 msec task-clock                #   51.243 CPUs utilized            ( +-  0.05% )
         1,329,033      context-switches          #  438.210 /sec                     ( +-  0.64% )
           225,550      cpu-migrations            #   74.369 /sec                     ( +-  1.36% )
        85,080,938      page-faults               #   28.053 K/sec                    ( +-  0.00% )
 8,629,663,367,477      cycles                    #    2.845 GHz                      ( +-  0.05% )
 7,696,237,813,803      instructions              #    0.89  insn per cycle           ( +-  0.00% )
 1,709,909,494,107      branches                  #  563.793 M/sec                    ( +-  0.00% )
    37,719,552,337      branch-misses             #    2.21% of all branches          ( +-  0.02% )
22,214,249,023,820      slots                     #    7.325 G/sec                    ( +-  0.06% )
 7,412,342,725,008      topdown-retiring          #     33.0% retiring                ( +-  0.01% )
 3,141,090,408,028      topdown-bad-spec          #     14.1% bad speculation         ( +-  0.17% )
 7,996,077,873,517      topdown-fe-bound          #     35.6% frontend bound          ( +-  0.03% )
 3,862,154,886,962      topdown-be-bound          #     17.3% backend bound           ( +-  0.28% )

            59.193 +- 0.302 seconds time elapsed  ( +-  0.51% )
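
As a rough cross-check of the "around one percent" reading, the relative
change between the two runs can be computed directly from the numbers
above. This is only a sketch: the values are copied by hand from the two
perf stat summaries, and the dictionary keys and the helper name
rel_change_pct are made up for illustration.

```python
# Values copied by hand from the two perf stat summaries
# (5.15-rc6 vs. 5.15-rc6 + patch). Key names are illustrative only.
baseline = {
    "elapsed_s": 60.114,
    "cycles": 8_743_357_421_495,
    "instructions": 7_643_946_991_468,
}
patched = {
    "elapsed_s": 59.193,
    "cycles": 8_629_663_367_477,
    "instructions": 7_696_237_813_803,
}


def rel_change_pct(old, new):
    """Relative change from old to new in percent (negative means smaller)."""
    return (new - old) / old * 100.0


# Per-metric relative change, patched kernel versus baseline.
deltas = {name: rel_change_pct(baseline[name], patched[name]) for name in baseline}

for name, pct in deltas.items():
    print(f"{name:>12}: {pct:+.2f}%")
```

With these inputs the patched kernel comes out at roughly 1.5% less
elapsed time and 1.3% fewer cycles, with about 0.7% more instructions.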

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
