Message-ID: <SYBPR01MB687069BFC9744585B4EEF8C49D88A@SYBPR01MB6870.ausprd01.prod.outlook.com>
Date:   Sun, 10 Dec 2023 16:07:42 +0800
From:   Tianyi Liu <i.pear@...look.com>
To:     seanjc@...gle.com, pbonzini@...hat.com, peterz@...radead.org,
        mingo@...hat.com, acme@...nel.org
Cc:     linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev,
        linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
        kvm@...r.kernel.org, x86@...nel.org, mark.rutland@....com,
        mlevitsk@...hat.com, maz@...nel.org,
        alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
        namhyung@...nel.org, irogers@...gle.com, adrian.hunter@...el.com,
        Tianyi Liu <i.pear@...look.com>
Subject: [PATCH v3 0/5] perf: KVM: Enable callchains for guests

This patch series enables callchains for guests (used by `perf kvm`),
an item that holds the top spot on the perf wiki TODO list [1]. It allows
users to analyze guest OS callchains and performance from the host,
using PMU events. This is also useful for guests such as unikernels that
lack a performance event subsystem.

The event processing flow is as follows (shown as backtrace):
@0 kvm_arch_vcpu_get_unwind_info / kvm_arch_vcpu_read_virt (per arch impl)
@1 kvm_guest_get_unwind_info / kvm_guest_read_virt
   <callback function pointers in `struct perf_guest_info_callbacks`>
@2 perf_guest_get_unwind_info / perf_guest_read_virt
@3 perf_callchain_guest
@4 get_perf_callchain
@5 perf_callchain

Between @0 and @1 is the interface between KVM and the arch-specific
impl, while between @1 and @2 is the interface between Perf and KVM.
The 1st patch implements @0. The 2nd patch extends interfaces between @1
and @2, while the 3rd patch implements @1. The 4th patch implements @3
and modifies @4 @5. The last patch is for userspace tools.
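
To make the layering concrete, here is a minimal userspace sketch of
levels @0 to @2. It is not the code from the patches: the struct fields,
the function names, and the mock guest memory are all assumptions made
for illustration, and only the shape (an arch layer, a hypervisor layer
exposing function pointers, and a profiler layer that dispatches through
them) follows the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bundle of state needed to start unwinding (fields assumed). */
struct unwind_info {
	uint64_t ip;	/* guest instruction pointer */
	uint64_t fp;	/* guest frame pointer */
};

/* Hypothetical analogue of the callback table between @1 and @2. */
struct guest_callbacks {
	bool (*get_unwind_info)(struct unwind_info *info);
	bool (*read_virt)(uint64_t addr, void *dst, size_t len);
};

/* @0: mock arch layer, backed by a small fake guest memory. */
uint8_t fake_guest_mem[64];

bool arch_get_unwind_info(struct unwind_info *info)
{
	info->ip = 0x401000;	/* made-up register values */
	info->fp = 16;
	return true;
}

bool arch_read_virt(uint64_t addr, void *dst, size_t len)
{
	/* Reject any range that is not fully inside the fake memory. */
	if (addr > sizeof(fake_guest_mem) ||
	    len > sizeof(fake_guest_mem) - addr)
		return false;
	memcpy(dst, fake_guest_mem + addr, len);
	return true;
}

/* @1: hypervisor layer, forwards to the arch implementation. */
bool hv_get_unwind_info(struct unwind_info *info)
{
	return arch_get_unwind_info(info);
}

bool hv_read_virt(uint64_t addr, void *dst, size_t len)
{
	return arch_read_virt(addr, dst, len);
}

const struct guest_callbacks hv_callbacks = {
	.get_unwind_info = hv_get_unwind_info,
	.read_virt = hv_read_virt,
};

/* @2: profiler layer, dispatches through whatever table is registered. */
const struct guest_callbacks *registered_cbs;	/* NULL: nothing registered */

void register_guest_callbacks(const struct guest_callbacks *cbs)
{
	registered_cbs = cbs;
}

bool profiler_get_unwind_info(struct unwind_info *info)
{
	return registered_cbs && registered_cbs->get_unwind_info &&
	       registered_cbs->get_unwind_info(info);
}

bool profiler_read_virt(uint64_t addr, void *dst, size_t len)
{
	return registered_cbs && registered_cbs->read_virt &&
	       registered_cbs->read_virt(addr, dst, len);
}
```

In this sketch the profiler layer never calls the arch code directly, so
it keeps working (by returning false) when no hypervisor has registered
its callbacks.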

arm64 does not yet provide some of the foundational infrastructure
(an interface for reading from a guest virtual address), and adding it
is somewhat complex, so the arm64 implementation is stubbed for now and
will be implemented later.

For safety, this feature only ever reads guest state: it never writes to
the guest and never injects page faults into it, ensuring that the
guest is not interfered with by profiling. In extremely rare cases, if the
guest is modifying its page tables, incorrect data may be read.
Additionally, if programs running in the guest OS are built without
frame pointers, unwinding may also produce some erroneous data.
Such erroneous data will eventually appear as `[unknown]` entries in the
report. For profiling purposes, it is sufficient that most of the records
are correct.
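
The read-only behaviour can be illustrated with a small frame-pointer
walk over simulated guest memory. This is a sketch under assumptions,
not the unwinder from the patches: `guest_mem`, `guest_read_u64` and
`unwind_guest` are hypothetical names, the frame layout assumed is the
usual x86-64 one (saved frame pointer at `[fp]`, return address at
`[fp + 8]`), and the sanity check on the next frame pointer is a choice
made for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUEST_BASE	0x1000u	/* made-up guest address of guest_mem[0] */
#define GUEST_WORDS	16

/* Simulated guest stack memory, addressed in 8-byte words. */
uint64_t guest_mem[GUEST_WORDS];

/*
 * Read-only access: an unmapped or misaligned address makes the read
 * fail; nothing is written and no fault is raised in the "guest".
 */
bool guest_read_u64(uint64_t addr, uint64_t *out)
{
	uint64_t idx;

	if (addr < GUEST_BASE || (addr & 7))
		return false;
	idx = (addr - GUEST_BASE) / 8;
	if (idx >= GUEST_WORDS)
		return false;
	*out = guest_mem[idx];
	return true;
}

/*
 * Walk the frame-pointer chain starting at (ip, fp) and store up to
 * max entries in chain[]. Returns the number of entries stored.
 */
size_t unwind_guest(uint64_t ip, uint64_t fp, uint64_t *chain, size_t max)
{
	size_t n = 0;

	if (max == 0)
		return 0;
	chain[n++] = ip;

	while (n < max && fp != 0) {
		uint64_t next_fp, ret;

		/* A failed read just ends the chain early. */
		if (!guest_read_u64(fp, &next_fp) ||
		    !guest_read_u64(fp + 8, &ret))
			break;
		chain[n++] = ret;

		/* The stack grows down, so a caller's frame must be higher. */
		if (next_fp <= fp)
			break;
		fp = next_fp;
	}
	return n;
}
```

When a read fails, the walk stops and the samples collected so far are
kept, which matches the idea that a bad frame costs only part of one
callchain rather than disturbing the guest.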

Regarding the necessity of implementing this in the kernel:
Indeed, we could implement this in userspace and access the guest VM
through the KVM APIs, interrupting the guest to perform unwinding.
However, this approach would introduce higher latency, and the overhead of
syscalls could limit the sampling frequency. Moreover, it appears that
userspace can only interrupt the vCPU at a certain frequency, without
fully leveraging the richness of the PMU's performance events. On the
other hand, if the logic lives in the kernel, `perf kvm` can bind
to various PMU events and do the unwinding faster, within PMU interrupts.

Tested with both Linux and unikernels as guests; in both cases the
`perf script` command correctly showed the callchains.
FlameGraphs could also be generated with this series of patches and [2].

[1] https://perf.wiki.kernel.org/index.php/Todo
[2] https://github.com/brendangregg/FlameGraph

v1:
https://lore.kernel.org/kvm/SYYP282MB108686A73C0F896D90D246569DE5A@SYYP282MB1086.AUSP282.PROD.OUTLOOK.COM/

Changes since v1:
Posted the complete implementation and updated some code based on
Sean's feedback.

v2:
https://lore.kernel.org/kvm/SY4P282MB1084ECBCC1B176153B9E2A009DCFA@SY4P282MB1084.AUSP282.PROD.OUTLOOK.COM/

Changes since v2:
Refactored the interface and packaged the info required for unwinding
into a struct; resolved some type mismatches; provided more explanations
based on the feedback on v2; performed more tests.

Tianyi Liu (5):
  KVM: Add arch specific interfaces for sampling guest callchains
  perf kvm: Introduce guest interfaces for sampling callchains
  KVM: implement new perf callback interfaces
  perf kvm: Support sampling guest callchains
  perf tools: Support PERF_CONTEXT_GUEST_* flags

 MAINTAINERS                         |  1 +
 arch/arm64/kvm/arm.c                | 12 ++++++
 arch/x86/events/core.c              | 63 ++++++++++++++++++++++++-----
 arch/x86/kvm/x86.c                  | 24 +++++++++++
 include/linux/kvm_host.h            |  5 +++
 include/linux/perf_event.h          | 20 ++++++++-
 include/linux/perf_kvm.h            | 18 +++++++++
 kernel/bpf/stackmap.c               |  8 ++--
 kernel/events/callchain.c           | 27 ++++++++++++-
 kernel/events/core.c                | 17 +++++++-
 tools/perf/builtin-timechart.c      |  6 +++
 tools/perf/util/data-convert-json.c |  6 +++
 tools/perf/util/machine.c           |  6 +++
 virt/kvm/kvm_main.c                 | 22 ++++++++++
 14 files changed, 218 insertions(+), 17 deletions(-)
 create mode 100644 include/linux/perf_kvm.h


base-commit: 33cc938e65a98f1d29d0a18403dbbee050dcad9a
-- 
2.34.1
