Message-ID: <20210312205914.GG22098@zn.tnic>
Date:   Fri, 12 Mar 2021 21:59:14 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     x86@...nel.org, rostedt@...dmis.org, hpa@...or.com,
        torvalds@...uxfoundation.org, linux-kernel@...r.kernel.org,
        linux-toolchains@...r.kernel.org, jpoimboe@...hat.com,
        alexei.starovoitov@...il.com, mhiramat@...nel.org
Subject: Re: [PATCH 0/2] x86: Remove ideal_nops[]

On Fri, Mar 12, 2021 at 12:32:53PM +0100, Peter Zijlstra wrote:
> Since ultimate performance of a 10 year old chip (Intel Sandy Bridge, 2011) is
> simply irrelevant today, remove variable NOPs and use NOPL.

Just ran them on my SNB box:

cpu family      : 6
model           : 45
model name      : Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz
stepping        : 7

with the usual perf stat kernel build workload, with
CONFIG_DYNAMIC_FTRACE and CONFIG_FUNCTION_TRACER enabled, so that each
function has a NOP at its beginning when ftrace is disabled (thx Steve).

./tools/perf/perf stat --repeat 5 --sync --pre=/root/bin/pre-build-kernel.sh -- make -s -j9 bzImage

before: tip-master

 Performance counter stats for 'make -s -j9 bzImage' (5 runs):

      3,213,728.10 msec task-clock                #    7.307 CPUs utilized            ( +-  0.01% )
           339,270      context-switches          #    0.106 K/sec                    ( +-  0.09% )
            31,472      cpu-migrations            #    0.010 K/sec                    ( +-  0.64% )
        62,070,684      page-faults               #    0.019 M/sec                    ( +-  0.01% )
11,498,198,009,323      cycles                    #    3.578 GHz                      ( +-  0.01% )  (83.33%)
 8,235,957,366,696      stalled-cycles-frontend   #   71.63% frontend cycles idle     ( +-  0.01% )  (83.33%)
 5,976,456,688,814      stalled-cycles-backend    #   51.98% backend cycles idle      ( +-  0.02% )  (66.67%)
 7,553,156,344,376      instructions              #    0.66  insn per cycle         
                                                  #    1.09  stalled cycles per insn  ( +-  0.00% )  (83.33%)
 1,635,468,917,524      branches                  #  508.901 M/sec                    ( +-  0.00% )  (83.34%)
    51,888,292,932      branch-misses             #    3.17% of all branches          ( +-  0.02% )  (83.33%)

           439.809 +- 0.156 seconds time elapsed  ( +-  0.04% )


after: tip-master-nops

 Performance counter stats for 'make -s -j9 bzImage' (5 runs):

      3,217,113.67 msec task-clock                #    7.307 CPUs utilized            ( +-  0.03% )
           339,425      context-switches          #    0.106 K/sec                    ( +-  0.20% )
            31,724      cpu-migrations            #    0.010 K/sec                    ( +-  0.54% )
        62,027,130      page-faults               #    0.019 M/sec                    ( +-  0.01% )
11,508,779,965,901      cycles                    #    3.577 GHz                      ( +-  0.03% )  (83.34%)
 8,241,212,210,440      stalled-cycles-frontend   #   71.61% frontend cycles idle     ( +-  0.04% )  (83.33%)
 5,982,615,533,177      stalled-cycles-backend    #   51.98% backend cycles idle      ( +-  0.06% )  (66.66%)
 7,546,407,430,314      instructions              #    0.66  insn per cycle         
                                                  #    1.09  stalled cycles per insn  ( +-  0.00% )  (83.33%)
 1,634,187,006,479      branches                  #  507.967 M/sec                    ( +-  0.00% )  (83.33%)
    51,941,580,371      branch-misses             #    3.18% of all branches          ( +-  0.01% )  (83.33%)

           440.266 +- 0.195 seconds time elapsed  ( +-  0.04% )
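
For a quick side-by-side, the relative change per counter can be worked
out from the two summaries above. This is only a minimal sketch in Python:
the dictionary keys and the helper name relative_change() are made up for
illustration and are not part of perf; the numbers are copied from the two
runs.

```python
# Values copied from the two perf stat summaries (means over 5 runs).
before = {
    "task-clock-msec": 3_213_728.10,
    "cycles": 11_498_198_009_323,
    "instructions": 7_553_156_344_376,
    "branch-misses": 51_888_292_932,
    "elapsed-sec": 439.809,
}
after = {
    "task-clock-msec": 3_217_113.67,
    "cycles": 11_508_779_965_901,
    "instructions": 7_546_407_430_314,
    "branch-misses": 51_941_580_371,
    "elapsed-sec": 440.266,
}


def relative_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Hypothetical helper output: one signed percentage per counter.
deltas = {name: relative_change(before[name], after[name]) for name in before}

for name, pct in deltas.items():
    print(f"{name:>16}: {pct:+.3f}%")
```

Each of these counters moves by roughly a tenth of a percent or less
between tip-master and tip-master-nops.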


So here it is: numbers talk, bullshit walks. And with those numbers, no
bullshit can remain lingering around anyway.

Cheers!

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
