Message-ID: <CA+icZUXLyFqq0y_GnKca8MS4wO2kcj4K-D1kBHLa8u_pnLZ7eQ@mail.gmail.com>
Date: Mon, 15 Mar 2021 18:04:41 +0100
From: Sedat Dilek <sedat.dilek@...il.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
rostedt@...dmis.org, hpa@...or.com, torvalds@...uxfoundation.org,
linux-kernel@...r.kernel.org, linux-toolchains@...r.kernel.org,
jpoimboe@...hat.com, alexei.starovoitov@...il.com,
mhiramat@...nel.org
Subject: Re: [PATCH 0/2] x86: Remove ideal_nops[]
On Sat, Mar 13, 2021 at 2:47 PM Sedat Dilek <sedat.dilek@...il.com> wrote:
[ ... ]
> Let me look if I will do a selfmade ThinLTO+PGO optimized LLVM
> toolchain v12.0.0-rc3 this weekend.
>
I did it.
Here are some fresh numbers:
[ Selfmade LLVM toolchain v12.0.0-rc3 "stage1-only" ]
[ Host-Kernel: 5.12.0-rc2-8-amd64-clang12-cfi includes Peter's NOPS patchset ]
Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1
PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-9-amd64-clang12-cfi
KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza
KBUILD_BUILD_USER=sedat.dilek@...il.com
KBUILD_BUILD_TIMESTAMP=2021-03-13 bindeb-pkg
KDEB_PKGVERSION=5.12.0~rc2-9~bullseye+dileks1':

     55936351.95 msec task-clock               #    3.580 CPUs utilized
         8291848      context-switches         #    0.148 K/sec
          269686      cpu-migrations           #    0.005 K/sec
       288389721      page-faults              #    0.005 M/sec
 108344049253836      cycles                   #    1.937 GHz
  83228135285263      stalled-cycles-frontend  #   76.82% frontend cycles idle
  65616255370809      stalled-cycles-backend   #   60.56% backend cycles idle
  59590373937199      instructions             #    0.55  insn per cycle
                                               #    1.40  stalled cycles per insn
  10906265495505      branches                 #  194.976 M/sec
    488578274434      branch-misses            #    4.48% of all branches

 15622.926203302 seconds time elapsed

 53453.974928000 seconds user
  2526.773533000 seconds sys

[ Selfmade LLVM toolchain v12.0.0-rc3 "thinlto_pgo_optimized" ]
[ Host-Kernel: Debian's 5.10.19-1 kernel ]
Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1
PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-10-amd64-clang12-cfi
KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza
KBUILD_BUILD_USER=sedat.dilek@...il.com
KBUILD_BUILD_TIMESTAMP=2021-03-14 bindeb-pkg
KDEB_PKGVERSION=5.12.0~rc2-10~bullseye+dileks1':

     40223080.69 msec task-clock               #    3.434 CPUs utilized
         7438923      context-switches         #    0.185 K/sec
          245636      cpu-migrations           #    0.006 K/sec
       288073015      page-faults              #    0.007 M/sec
  77325441657129      cycles                   #    1.922 GHz
  55357463522675      stalled-cycles-frontend  #   71.59% frontend cycles idle
  38978871249074      stalled-cycles-backend   #   50.41% backend cycles idle
  55178265045056      instructions             #    0.71  insn per cycle
                                               #    1.00  stalled cycles per insn
   9749166033571      branches                 #  242.377 M/sec
    431303563167      branch-misses            #    4.42% of all branches

 11714.751645982 seconds time elapsed

 37951.117840000 seconds user
  2313.807151000 seconds sys

[ Selfmade LLVM toolchain v12.0.0-rc3 "thinlto_pgo_optimized" ]
[ Host-Kernel: 5.12.0-rc2-10-amd64-clang12-cfi includes Peter's NOPS patchset ]
Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1
PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-1-amd64-clang12-cfi
KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza
KBUILD_BUILD_USER=sedat.dilek@...il.com
KBUILD_BUILD_TIMESTAMP=2021-03-15 bindeb-pkg
KDEB_PKGVERSION=5.12.0~rc3-1~bullseye+dileks1':

     40632207.25 msec task-clock               #    3.406 CPUs utilized
         8216832      context-switches         #    0.202 K/sec
          277610      cpu-migrations           #    0.007 K/sec
       281331052      page-faults              #    0.007 M/sec
  77031538570411      cycles                   #    1.896 GHz                      (83.33%)
  55247905369487      stalled-cycles-frontend  #   71.72% frontend cycles idle     (83.33%)
  39046795510242      stalled-cycles-backend   #   50.69% backend cycles idle      (66.67%)
  54592585444704      instructions             #    0.71  insn per cycle
                                               #    1.01  stalled cycles per insn  (83.33%)
   9641589406714      branches                 #  237.289 M/sec                    (83.33%)
    435317273069      branch-misses            #    4.51% of all branches          (83.33%)

 11928.047003788 seconds time elapsed

 38187.685111000 seconds user
  2502.075987000 seconds sys

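The "insn per cycle" figures that perf prints can be cross-checked from the raw counts. The following is only a minimal sketch: the cycle and instruction counts are copied from the three runs above, while the short run labels and the helper name `ipc` are made up here for illustration. Note that the counts of the third run are scaled estimates, since perf multiplexed those counters (the percentages in parentheses).

```python
# Raw counts copied from the three perf-stat runs above.
# The short run labels are invented for this sketch.
runs = {
    "stage1-only, NOPS host": {
        "cycles": 108344049253836,
        "instructions": 59590373937199,
    },
    "thinlto_pgo, Debian host": {
        "cycles": 77325441657129,
        "instructions": 55178265045056,
    },
    "thinlto_pgo, NOPS host": {
        "cycles": 77031538570411,
        "instructions": 54592585444704,
    },
}


def ipc(counters):
    """Instructions retired per CPU cycle for one run."""
    return counters["instructions"] / counters["cycles"]


ipc_by_run = {label: ipc(counters) for label, counters in runs.items()}

for label, value in ipc_by_run.items():
    print(f"{label}: {value:.2f} insn per cycle")
```

Rounded to two digits, this should reproduce the 0.55 / 0.71 / 0.71 values reported by perf.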
As said in an earlier email:
A ThinLTO+PGO optimized LLVM toolchain saves approx. 60 minutes of
build time here.
With a host kernel that includes Peter's NOPS patchset, the build takes
about 3 minutes longer.
That is the brewing time of a single Turkish tea bag.
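The two deltas can be recomputed from the "seconds time elapsed" lines. This is a rough sketch, not a benchmark analysis: the values are copied from the three runs above, the variable names are made up here, and pairing the two runs on the NOPS host kernels for the toolchain comparison is an assumption. The three builds also differ in kernel revision (rc2 vs. rc3), so the numbers are indicative only.

```python
# Wall-clock seconds ("seconds time elapsed") copied from the runs above.
stage1_nops_host = 15622.926203302    # stage1-only toolchain, NOPS host kernel
thinlto_debian_host = 11714.751645982  # ThinLTO+PGO toolchain, Debian 5.10.19-1
thinlto_nops_host = 11928.047003788    # ThinLTO+PGO toolchain, NOPS host kernel


def minutes(seconds):
    """Convert seconds to minutes."""
    return seconds / 60.0


# Toolchain effect: both runs on a host kernel with the NOPS patchset
# (this pairing is an assumption made for the sketch).
toolchain_saving_min = minutes(stage1_nops_host - thinlto_nops_host)

# Host-kernel effect: same ThinLTO+PGO toolchain, different host kernel.
host_kernel_cost_min = minutes(thinlto_nops_host - thinlto_debian_host)

print(f"toolchain saving: {toolchain_saving_min:.1f} min")
print(f"NOPS host kernel: {host_kernel_cost_min:.1f} min longer")
```

Under these assumptions the results come out at roughly an hour saved and between three and four minutes added, in line with the figures above.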
Attached are the 3 build-time log-files.
- Sedat -
View attachment "build-time_5.12.0-rc2-9-amd64-clang12-cfi.txt" of type "text/plain" (1344 bytes)
View attachment "build-time_5.12.0-rc2-10-amd64-clang12-cfi.txt" of type "text/plain" (1346 bytes)
View attachment "build-time_5.12.0-rc3-1-amd64-clang12-cfi.txt" of type "text/plain" (1404 bytes)