Message-ID: <20241113190156.2145593-1-coltonlewis@google.com>
Date: Wed, 13 Nov 2024 19:01:50 +0000
From: Colton Lewis <coltonlewis@...gle.com>
To: kvm@...r.kernel.org
Cc: Oliver Upton <oliver.upton@...ux.dev>, Sean Christopherson <seanjc@...gle.com>,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>, Will Deacon <will@...nel.org>,
Russell King <linux@...linux.org.uk>, Catalin Marinas <catalin.marinas@....com>,
Michael Ellerman <mpe@...erman.id.au>, Nicholas Piggin <npiggin@...il.com>,
Christophe Leroy <christophe.leroy@...roup.eu>, Naveen N Rao <naveen@...nel.org>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>, Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
Colton Lewis <coltonlewis@...gle.com>
Subject: [PATCH v8 0/5] Correct perf sampling with Guest VMs
v8:
* Improve patch 4 perf flags refactor
* Rebase to v6.12-rc7
v7:
https://lore.kernel.org/all/20241107190336.2963882-1-coltonlewis@google.com/
v6:
https://lore.kernel.org/all/20241105195603.2317483-1-coltonlewis@google.com/
v5:
https://lore.kernel.org/all/20240920174740.781614-1-coltonlewis@google.com/
v4:
https://lore.kernel.org/kvm/20240919190750.4163977-1-coltonlewis@google.com/
v3:
https://lore.kernel.org/kvm/20240912205133.4171576-1-coltonlewis@google.com/
v2:
https://lore.kernel.org/kvm/20240911222433.3415301-1-coltonlewis@google.com/
v1:
https://lore.kernel.org/kvm/20240904204133.1442132-1-coltonlewis@google.com/
This series cleans up perf recording around guest events and improves
the accuracy of the resulting perf reports.
Perf was incorrectly counting any PMU overflow interrupt that occurred
while a VCPU was loaded as a guest event, even when the events were not
truly guest events. This led to much less accurate and useful perf
recordings.
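The difference can be sketched as a small userspace model. This is only
an illustration, not the code from the patches: the struct, enum, and
function names are hypothetical, and it assumes that a "truly guest"
event is one that was not opened with exclude_guest set. The cover
letter itself does not spell out the exact condition.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
struct sample_ctx {
	bool vcpu_loaded;   /* a VCPU is loaded on this CPU at overflow time */
	bool exclude_guest; /* the event was opened to exclude guest activity */
};

enum sample_origin { SAMPLE_HOST, SAMPLE_GUEST };

/*
 * Behaviour described as the bug: every overflow that arrives while a
 * VCPU is loaded is attributed to the guest.
 */
static enum sample_origin classify_before(const struct sample_ctx *c)
{
	return c->vcpu_loaded ? SAMPLE_GUEST : SAMPLE_HOST;
}

/*
 * Assumed corrected behaviour: attribute the sample to the guest only
 * when a VCPU is loaded AND the event is allowed to count guest
 * activity. The exclude_guest test is an assumption of this sketch.
 */
static enum sample_origin classify_after(const struct sample_ctx *c)
{
	return (c->vcpu_loaded && !c->exclude_guest) ? SAMPLE_GUEST
						       : SAMPLE_HOST;
}
```

Under these assumptions, a host-only event that overflows while a VCPU
is loaded is tagged as a guest sample by classify_before() and as a
host sample by classify_after(), which is the kind of shift visible in
the reports below.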
As an example, see the reports below from `perf record
dirty_log_perf_test -m 2 -v 4`, taken before and after the series on
ARM64.
Without series:
Samples: 15K of event 'instructions', Event count (approx.): 31830580924
Overhead  Command          Shared Object        Symbol
  54.54%  dirty_log_perf_  dirty_log_perf_test  [.] run_test
   5.39%  dirty_log_perf_  dirty_log_perf_test  [.] vcpu_worker
   0.89%  dirty_log_perf_  [kernel.vmlinux]     [k] release_pages
   0.70%  dirty_log_perf_  [kernel.vmlinux]     [k] free_pcppages_bulk
   0.62%  dirty_log_perf_  dirty_log_perf_test  [.] userspace_mem_region_find
   0.49%  dirty_log_perf_  dirty_log_perf_test  [.] sparsebit_is_set
   0.46%  dirty_log_perf_  dirty_log_perf_test  [.] _virt_pg_map
   0.46%  dirty_log_perf_  dirty_log_perf_test  [.] node_add
   0.37%  dirty_log_perf_  dirty_log_perf_test  [.] node_reduce
   0.35%  dirty_log_perf_  [kernel.vmlinux]     [k] free_unref_page_commit
   0.33%  dirty_log_perf_  [kernel.vmlinux]     [k] __kvm_pgtable_walk
   0.31%  dirty_log_perf_  [kernel.vmlinux]     [k] stage2_attr_walker
   0.29%  dirty_log_perf_  [kernel.vmlinux]     [k] unmap_page_range
   0.29%  dirty_log_perf_  dirty_log_perf_test  [.] test_assert
   0.26%  dirty_log_perf_  [kernel.vmlinux]     [k] __mod_memcg_lruvec_state
   0.24%  dirty_log_perf_  [kernel.vmlinux]     [k] kvm_s2_put_page
With series:
Samples: 15K of event 'instructions', Event count (approx.): 30898031385
Overhead  Command          Shared Object        Symbol
  54.05%  dirty_log_perf_  dirty_log_perf_test  [.] run_test
   5.48%  dirty_log_perf_  [kernel.kallsyms]    [k] kvm_arch_vcpu_ioctl_run
   4.70%  dirty_log_perf_  dirty_log_perf_test  [.] vcpu_worker
   3.11%  dirty_log_perf_  [kernel.kallsyms]    [k] kvm_handle_guest_abort
   2.24%  dirty_log_perf_  [kernel.kallsyms]    [k] up_read
   1.98%  dirty_log_perf_  [kernel.kallsyms]    [k] __kvm_tlb_flush_vmid_ipa_nsh
   1.97%  dirty_log_perf_  [kernel.kallsyms]    [k] __pi_clear_page
   1.30%  dirty_log_perf_  [kernel.kallsyms]    [k] down_read
   1.13%  dirty_log_perf_  [kernel.kallsyms]    [k] release_pages
   1.12%  dirty_log_perf_  [kernel.kallsyms]    [k] __kvm_pgtable_walk
   1.08%  dirty_log_perf_  [kernel.kallsyms]    [k] folio_batch_move_lru
   1.06%  dirty_log_perf_  [kernel.kallsyms]    [k] __srcu_read_lock
   1.03%  dirty_log_perf_  [kernel.kallsyms]    [k] get_page_from_freelist
   1.01%  dirty_log_perf_  [kernel.kallsyms]    [k] __pte_offset_map_lock
   0.82%  dirty_log_perf_  [kernel.kallsyms]    [k] handle_mm_fault
   0.74%  dirty_log_perf_  [kernel.kallsyms]    [k] mas_state_walk
Colton Lewis (5):
  arm: perf: Drop unused functions
  perf: Hoist perf_instruction_pointer() and perf_misc_flags()
  powerpc: perf: Use perf_arch_instruction_pointer()
  x86: perf: Refactor misc flag assignments
  perf: Correct perf sampling with guest VMs
 arch/arm/include/asm/perf_event.h            |  7 ---
 arch/arm/kernel/perf_callchain.c             | 17 ------
 arch/arm64/include/asm/perf_event.h          |  4 --
 arch/arm64/kernel/perf_callchain.c           | 28 ---------
 arch/powerpc/include/asm/perf_event_server.h |  6 +-
 arch/powerpc/perf/callchain.c                |  2 +-
 arch/powerpc/perf/callchain_32.c             |  2 +-
 arch/powerpc/perf/callchain_64.c             |  2 +-
 arch/powerpc/perf/core-book3s.c              |  4 +-
 arch/s390/include/asm/perf_event.h           |  6 +-
 arch/s390/kernel/perf_event.c                |  4 +-
 arch/x86/events/core.c                       | 64 +++++++++++++-------
 arch/x86/include/asm/perf_event.h            | 12 ++--
 include/linux/perf_event.h                   | 26 +++++++-
 kernel/events/core.c                         | 27 ++++++++-
 15 files changed, 111 insertions(+), 100 deletions(-)
base-commit: 2d5404caa8c7bb5c4e0435f94b28834ae5456623
--
2.47.0.338.g60cca15819-goog