[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240904204133.1442132-1-coltonlewis@google.com>
Date: Wed, 4 Sep 2024 20:41:28 +0000
From: Colton Lewis <coltonlewis@...gle.com>
To: kvm@...r.kernel.org
Cc: Oliver Upton <oliver.upton@...ux.dev>, Sean Christopherson <seanjc@...gle.com>,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>, Will Deacon <will@...nel.org>,
Russell King <linux@...linux.org.uk>, Catalin Marinas <catalin.marinas@....com>,
Michael Ellerman <mpe@...erman.id.au>, Nicholas Piggin <npiggin@...il.com>,
Christophe Leroy <christophe.leroy@...roup.eu>, Naveen N Rao <naveen@...nel.org>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>, Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
Colton Lewis <coltonlewis@...gle.com>
Subject: [PATCH 0/5] Correct perf sampling with guest VMs
This series cleans up perf recording around guest events and improves
the accuracy of the resulting perf reports.
Perf was incorrectly counting any PMU overflow interrupt that occured
while a VCPU was loaded as a guest event even when the events were not
truely guest events. This lead to much less accurate and useful perf
recordings.
See as an example the below reports of `perf record
dirty_log_perf_test -m 2 -v 4` before and after the series on ARM64.
Without series:
Samples: 15K of event 'instructions', Event count (approx.): 31830580924
Overhead Command Shared Object Symbol
54.54% dirty_log_perf_ dirty_log_perf_test [.] run_test
5.39% dirty_log_perf_ dirty_log_perf_test [.] vcpu_worker
0.89% dirty_log_perf_ [kernel.vmlinux] [k] release_pages
0.70% dirty_log_perf_ [kernel.vmlinux] [k] free_pcppages_bulk
0.62% dirty_log_perf_ dirty_log_perf_test [.] userspace_mem_region_find
0.49% dirty_log_perf_ dirty_log_perf_test [.] sparsebit_is_set
0.46% dirty_log_perf_ dirty_log_perf_test [.] _virt_pg_map
0.46% dirty_log_perf_ dirty_log_perf_test [.] node_add
0.37% dirty_log_perf_ dirty_log_perf_test [.] node_reduce
0.35% dirty_log_perf_ [kernel.vmlinux] [k] free_unref_page_commit
0.33% dirty_log_perf_ [kernel.vmlinux] [k] __kvm_pgtable_walk
0.31% dirty_log_perf_ [kernel.vmlinux] [k] stage2_attr_walker
0.29% dirty_log_perf_ [kernel.vmlinux] [k] unmap_page_range
0.29% dirty_log_perf_ dirty_log_perf_test [.] test_assert
0.26% dirty_log_perf_ [kernel.vmlinux] [k] __mod_memcg_lruvec_state
0.24% dirty_log_perf_ [kernel.vmlinux] [k] kvm_s2_put_page
With series:
Samples: 15K of event 'instructions', Event count (approx.): 31830580924
Samples: 15K of event 'instructions', Event count (approx.): 30898031385
Overhead Command Shared Object Symbol
54.05% dirty_log_perf_ dirty_log_perf_test [.] run_test
5.48% dirty_log_perf_ [kernel.kallsyms] [k] kvm_arch_vcpu_ioctl_run
4.70% dirty_log_perf_ dirty_log_perf_test [.] vcpu_worker
3.11% dirty_log_perf_ [kernel.kallsyms] [k] kvm_handle_guest_abort
2.24% dirty_log_perf_ [kernel.kallsyms] [k] up_read
1.98% dirty_log_perf_ [kernel.kallsyms] [k] __kvm_tlb_flush_vmid_ipa_nsh
1.97% dirty_log_perf_ [kernel.kallsyms] [k] __pi_clear_page
1.30% dirty_log_perf_ [kernel.kallsyms] [k] down_read
1.13% dirty_log_perf_ [kernel.kallsyms] [k] release_pages
1.12% dirty_log_perf_ [kernel.kallsyms] [k] __kvm_pgtable_walk
1.08% dirty_log_perf_ [kernel.kallsyms] [k] folio_batch_move_lru
1.06% dirty_log_perf_ [kernel.kallsyms] [k] __srcu_read_lock
1.03% dirty_log_perf_ [kernel.kallsyms] [k] get_page_from_freelist
1.01% dirty_log_perf_ [kernel.kallsyms] [k] __pte_offset_map_lock
0.82% dirty_log_perf_ [kernel.kallsyms] [k] handle_mm_fault
0.74% dirty_log_perf_ [kernel.kallsyms] [k] mas_state_walk
Colton Lewis (5):
arm: perf: Drop unused functions
perf: Hoist perf_instruction_pointer() and perf_misc_flags()
powerpc: perf: Use perf_arch_instruction_pointer()
x86: perf: Refactor misc flag assignments
perf: Correct perf sampling with guest VMs
arch/arm/include/asm/perf_event.h | 7 ---
arch/arm/kernel/perf_callchain.c | 17 -------
arch/arm64/include/asm/perf_event.h | 4 --
arch/arm64/kernel/perf_callchain.c | 28 ------------
arch/powerpc/include/asm/perf_event_server.h | 6 +--
arch/powerpc/perf/callchain.c | 2 +-
arch/powerpc/perf/callchain_32.c | 2 +-
arch/powerpc/perf/callchain_64.c | 2 +-
arch/powerpc/perf/core-book3s.c | 4 +-
arch/s390/include/asm/perf_event.h | 6 +--
arch/s390/kernel/perf_event.c | 4 +-
arch/x86/events/core.c | 47 +++++++++++---------
arch/x86/include/asm/perf_event.h | 12 ++---
include/linux/perf_event.h | 26 +++++++++--
kernel/events/core.c | 27 ++++++++++-
15 files changed, 95 insertions(+), 99 deletions(-)
--
2.46.0.469.g59c65b2a67-goog
Powered by blists - more mailing lists