Message-ID: <CA+icZUXLyFqq0y_GnKca8MS4wO2kcj4K-D1kBHLa8u_pnLZ7eQ@mail.gmail.com>
Date:   Mon, 15 Mar 2021 18:04:41 +0100
From:   Sedat Dilek <sedat.dilek@...il.com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
        rostedt@...dmis.org, hpa@...or.com, torvalds@...uxfoundation.org,
        linux-kernel@...r.kernel.org, linux-toolchains@...r.kernel.org,
        jpoimboe@...hat.com, alexei.starovoitov@...il.com,
        mhiramat@...nel.org
Subject: Re: [PATCH 0/2] x86: Remove ideal_nops[]

On Sat, Mar 13, 2021 at 2:47 PM Sedat Dilek <sedat.dilek@...il.com> wrote:
[ ... ]
> Let me look if I will do a selfmade ThinLTO+PGO optimized LLVM
> toolchain v12.0.0-rc3 this weekend.
>

I did it.

Here are some fresh numbers:

[ Selfmade LLVM toolchain v12.0.0-rc3 "stage1-only" ]
[ Host-Kernel: 5.12.0-rc2-8-amd64-clang12-cfi includes Peter's NOPS patchset ]

Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1
PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-9-amd64-clang12-cfi
KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza
KBUILD_BUILD_USER=sedat.dilek@...il.com
KBUILD_BUILD_TIMESTAMP=2021-03-13 bindeb-pkg
KDEB_PKGVERSION=5.12.0~rc2-9~bullseye+dileks1':

      55936351.95 msec task-clock                #    3.580 CPUs utilized
          8291848      context-switches          #    0.148 K/sec
           269686      cpu-migrations            #    0.005 K/sec
        288389721      page-faults               #    0.005 M/sec
  108344049253836      cycles                    #    1.937 GHz
   83228135285263      stalled-cycles-frontend   #   76.82% frontend cycles idle
   65616255370809      stalled-cycles-backend    #   60.56% backend cycles idle
   59590373937199      instructions              #    0.55  insn per cycle
                                                 #    1.40  stalled cycles per insn
   10906265495505      branches                  #  194.976 M/sec
     488578274434      branch-misses             #    4.48% of all branches

  15622.926203302 seconds time elapsed

  53453.974928000 seconds user
   2526.773533000 seconds sys


[ Selfmade LLVM toolchain v12.0.0-rc3 "thinlto_pgo_optimized" ]
[ Host-Kernel: Debian's 5.10.19-1 kernel ]

Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1
PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-10-amd64-clang12-cfi
KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza
KBUILD_BUILD_USER=sedat.dilek@...il.com
KBUILD_BUILD_TIMESTAMP=2021-03-14 bindeb-pkg
KDEB_PKGVERSION=5.12.0~rc2-10~bullseye+dileks1':

      40223080.69 msec task-clock                #    3.434 CPUs utilized
          7438923      context-switches          #    0.185 K/sec
           245636      cpu-migrations            #    0.006 K/sec
        288073015      page-faults               #    0.007 M/sec
   77325441657129      cycles                    #    1.922 GHz
   55357463522675      stalled-cycles-frontend   #   71.59% frontend cycles idle
   38978871249074      stalled-cycles-backend    #   50.41% backend cycles idle
   55178265045056      instructions              #    0.71  insn per cycle
                                                 #    1.00  stalled cycles per insn
    9749166033571      branches                  #  242.377 M/sec
     431303563167      branch-misses             #    4.42% of all branches

  11714.751645982 seconds time elapsed

  37951.117840000 seconds user
   2313.807151000 seconds sys


[ Selfmade LLVM toolchain v12.0.0-rc3 "thinlto_pgo_optimized" ]
[ Host-Kernel: 5.12.0-rc2-10-amd64-clang12-cfi includes Peter's NOPS patchset ]

Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1
PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-1-amd64-clang12-cfi
KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza
KBUILD_BUILD_USER=sedat.dilek@...il.com
KBUILD_BUILD_TIMESTAMP=2021-03-15 bindeb-pkg
KDEB_PKGVERSION=5.12.0~rc3-1~bullseye+dileks1':

      40632207.25 msec task-clock                #    3.406 CPUs utilized
          8216832      context-switches          #    0.202 K/sec
           277610      cpu-migrations            #    0.007 K/sec
        281331052      page-faults               #    0.007 M/sec
   77031538570411      cycles                    #    1.896 GHz                      (83.33%)
   55247905369487      stalled-cycles-frontend   #   71.72% frontend cycles idle     (83.33%)
   39046795510242      stalled-cycles-backend    #   50.69% backend cycles idle      (66.67%)
   54592585444704      instructions              #    0.71  insn per cycle
                                                 #    1.01  stalled cycles per insn  (83.33%)
    9641589406714      branches                  #  237.289 M/sec                    (83.33%)
     435317273069      branch-misses             #    4.51% of all branches          (83.33%)

  11928.047003788 seconds time elapsed

  38187.685111000 seconds user
   2502.075987000 seconds sys

As said in an earlier email:
A ThinLTO+PGO optimized LLVM toolchain saves approx. 60 minutes of build time here.

With a host kernel that includes Peter's NOPS patchset: approx. 3 minutes
longer build time.
That is the brewing time of one single Turkish tea bag.
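
For anyone who wants to re-check the arithmetic, here is a small Python sketch that derives both differences from the "seconds time elapsed" lines of the three runs above. The dictionary keys and the helper name are made up for illustration only; the numbers are copied from the perf output. Note that the 60-minute figure is taken here as the comparison of the two runs on the NOPS host kernels, which is my reading of the numbers.

```python
# Wall-clock seconds copied from the "seconds time elapsed" lines above.
# The keys are hypothetical labels chosen for this sketch.
elapsed = {
    "stage1_nops_host": 15622.926203302,    # stage1-only toolchain, 5.12.0-rc2-8 host
    "pgo_debian_host": 11714.751645982,     # thinlto_pgo toolchain, Debian 5.10.19-1 host
    "pgo_nops_host": 11928.047003788,       # thinlto_pgo toolchain, 5.12.0-rc2-10 host
}


def minutes_between(slower: float, faster: float) -> float:
    """Return the difference between two elapsed times, in minutes."""
    return (slower - faster) / 60.0


# Toolchain effect: both runs on a host kernel with the NOPS patchset.
toolchain_gain = minutes_between(elapsed["stage1_nops_host"],
                                 elapsed["pgo_nops_host"])

# Host-kernel effect: same thinlto_pgo toolchain, different host kernel.
host_delta = minutes_between(elapsed["pgo_nops_host"],
                             elapsed["pgo_debian_host"])

print(f"toolchain gain:    {toolchain_gain:.1f} min")
print(f"host-kernel delta: {host_delta:.1f} min")
```

These are single runs each, so treat the resulting differences as rough indications only.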

Attached are the 3 build-time log-files.

- Sedat -

View attachment "build-time_5.12.0-rc2-9-amd64-clang12-cfi.txt" of type "text/plain" (1344 bytes)

View attachment "build-time_5.12.0-rc2-10-amd64-clang12-cfi.txt" of type "text/plain" (1346 bytes)

View attachment "build-time_5.12.0-rc3-1-amd64-clang12-cfi.txt" of type "text/plain" (1404 bytes)
