Message-ID: <80734605-1926-4ac7-9c63-006fe3ea6b6a@amd.com>
Date: Wed, 31 Jul 2024 14:27:25 +0530
From: Shivank Garg <shivankg@....com>
To: kirill.shutemov@...ux.intel.com
Cc: ardb@...nel.org, bp@...en8.de, brijesh.singh@....com, corbet@....net,
dave.hansen@...ux.intel.com, hpa@...or.com, jan.kiszka@...mens.com,
jgross@...e.com, kbingham@...nel.org, linux-doc@...r.kernel.org,
linux-efi@...r.kernel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
luto@...nel.org, michael.roth@....com, mingo@...hat.com,
peterz@...radead.org, rick.p.edgecombe@...el.com, sandipan.das@....com,
tglx@...utronix.de, thomas.lendacky@....com, x86@...nel.org
Subject: Re: [PATCH 0/3] x86: Make 5-level paging support unconditional for x86-64
I ran some experiments to understand the impact of making 5-level page tables
the default.
Machine Info: AMD Zen 4 EPYC server (2-socket system, 128 cores and 1 NUMA
node per socket, SMT enabled). Each NUMA node has approximately 377 GB of memory.
For the experiments, I bound the benchmarks to the CPUs and memory node of a
single socket for consistent results. The two configurations were measured by
enabling/disabling the 5-level page table via CONFIG_X86_5LEVEL.
% Change: (5L-4L)/4L*100
CoV (%): Coefficient of Variation (%)
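For clarity, the two derived statistics above can be sketched in Python. This is purely illustrative of how the table values are computed (the function names are mine, not from any benchmark harness):

```python
import statistics

def pct_change(base, new):
    """% Change as defined above: (5L - 4L) / 4L * 100."""
    return (new - base) / base * 100.0

def cov_pct(samples):
    """Coefficient of Variation: sample stddev as a % of the mean."""
    return statistics.stdev(samples) / statistics.mean(samples) * 100.0

# Example with the lmbench page-fault means from the table below:
print(round(pct_change(0.4068, 0.4294), 2))  # 5.56
```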
Results:
lmbench:lat_pagefault: Metric- page-fault time (us) - Lower is better
              4-Level PT               5-Level PT               % Change
THP-Never     Mean: 0.4068             Mean: 0.4294               5.56
              95% CI: 0.4057-0.4078    95% CI: 0.4287-0.4302
THP-Always    Mean: 0.4061             Mean: 0.4288               5.59
              95% CI: 0.4051-0.4071    95% CI: 0.4281-0.4295
Btree (Threads: 32): Metric- Time Taken (s) - Lower is better
              4-Level                  5-Level
              Time Taken(s)  CoV (%)   Time Taken(s)  CoV (%)   % Change
THP Never     382.2          0.219     388.8          1.019       1.73
THP Madvise   383.0          0.261     384.8          0.809       0.47
THP Always    392.8          1.376     386.4          2.147      -1.63
Btree (Threads: 256): Metric- Time Taken (s) - Lower is better
              4-Level                  5-Level
              Time Taken(s)  CoV (%)   Time Taken(s)  CoV (%)   % Change
THP Never     56.6           2.014     55.2           0.810      -2.47
THP Madvise   56.6           2.014     56.4           2.022      -0.35
THP Always    56.6           0.968     56.2           1.489      -0.71
Ebizzy: Metric- records/s - Higher is better
              4-Level                  5-Level
Threads       records/s      CoV (%)   records/s      CoV (%)   % Change
1             844            0.302     837            0.196      -0.85
256           10160          0.315     10288          1.081       1.26
XSBench (Threads: 256, THP: Never) - Higher is better
Metric             4-Level       5-Level       % Change
Lookups/s          13720556      13396288       -2.36
CoV (%)            1.726         1.317
Hashjoin (Threads: 256, THP: Never) - Lower is better
Metric             4-Level       5-Level       % Change
Time Taken(s)      424.4         427.4           0.707
CoV (%)            0.394         0.209
Graph500 (Threads: 256, THP: Madvise) - Lower is better
Metric             4-Level       5-Level       % Change
Time Taken(s)      0.1879        0.1873         -0.32
CoV (%)            0.165         0.213
GUPS (Threads: 128, THP: Madvise) - Higher is better
Metric             4-Level       5-Level       % Change
GUPS               1.3265        1.3252         -0.10
CoV (%)            0.037         0.027
PageRank (Threads: 256, THP: Madvise) - Lower is better
Metric             4-Level       5-Level       % Change
Time Taken(s)      143.67        143.67          0.00
CoV (%)            0.402         0.402
Redis (Threads: 256, THP: Madvise) - Higher is better
Metric             4-Level       5-Level       % Change
Throughput(Ops/s)  141030744     139586376      -1.02
CoV (%)            0.372         0.561
memcached (Threads: 256, THP: Madvise) - Higher is better
Metric             4-Level       5-Level       % Change
Throughput(Ops/s)  19916313      19743637       -0.87
CoV (%)            0.051         0.095
Inference:
The 5-level page table shows an increase in page-fault latency (~5.6% in
lmbench), but it does not significantly impact the other benchmarks.
Thanks,
Shivank