Message-ID: <YiIGbbyx0uimsGN4@hirez.programming.kicks-ass.net>
Date: Fri, 4 Mar 2022 13:30:37 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Adrian Hunter <adrian.hunter@...el.com>
Cc: Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...hat.com>, linux-kernel@...r.kernel.org,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
kvm@...r.kernel.org, H Peter Anvin <hpa@...or.com>,
Mathieu Poirier <mathieu.poirier@...aro.org>,
Suzuki K Poulose <suzuki.poulose@....com>,
Leo Yan <leo.yan@...aro.org>
Subject: Re: [PATCH V2 02/11] perf/x86: Add support for TSC as a perf event clock

On Mon, Feb 14, 2022 at 01:09:05PM +0200, Adrian Hunter wrote:
> Currently, using Intel PT to trace a VM guest is limited to kernel space
> because decoding requires side band events such as MMAP and CONTEXT_SWITCH.
> While these events can be collected for the host, there is not a way to do
> that yet for a guest. One approach would be to collect them inside the
> guest, but that would require being able to synchronize with host
> timestamps.
>
> The motivation for this patch is to provide a clock that can be used within
> a VM guest, and that correlates to a VM host clock. In the case of TSC, if
> the hypervisor leaves rdtsc alone, the TSC value will be subject only to
> the VMCS TSC Offset and Scaling. Adjusting for that would make it possible
> to inject events from a guest perf.data file, into a host perf.data file.
>
> Thus making possible the collection of VM guest side band for Intel PT
> decoding.
>
> There are other potential benefits of TSC as a perf event clock:
> - ability to work directly with TSC
> - ability to inject non-Intel-PT-related events from a guest
>
> Signed-off-by: Adrian Hunter <adrian.hunter@...el.com>
> ---
> arch/x86/events/core.c | 16 +++++++++
> arch/x86/include/asm/perf_event.h | 3 ++
> include/uapi/linux/perf_event.h | 12 ++++++-
> kernel/events/core.c | 57 +++++++++++++++++++------------
> 4 files changed, 65 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index e686c5e0537b..51d5345de30a 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2728,6 +2728,17 @@ void arch_perf_update_userpage(struct perf_event *event,
>  		!!(event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT);
>  	userpg->pmc_width = x86_pmu.cntval_bits;
>
> +	if (event->attr.use_clockid &&
> +	    event->attr.ns_clockid &&
> +	    event->attr.clockid == CLOCK_PERF_HW_CLOCK) {
> +		userpg->cap_user_time_zero = 1;
> +		userpg->time_mult = 1;
> +		userpg->time_shift = 0;
> +		userpg->time_offset = 0;
> +		userpg->time_zero = 0;
> +		return;
> +	}
> +
>  	if (!using_native_sched_clock() || !sched_clock_stable())
>  		return;
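
This looks the wrong way around. The new block returns before the
using_native_sched_clock() / sched_clock_stable() check, so it exposes
TSC even when TSC was found unstable. If TSC is found unstable, we
should never expose it.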
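
Roughly the ordering meant here, as a user-space sketch rather than
kernel code. struct userpage_sketch, publish_hw_clock() and
cyc_to_time() are hypothetical names made up for illustration; only the
field names and the conversion formula follow the perf_event_mmap_page
documentation. The stability test comes first, and with time_mult = 1,
time_shift = 0 and time_zero = 0 the documented conversion hands back
the raw cycle count unchanged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the perf_event_mmap_page fields used here. */
struct userpage_sketch {
	bool     cap_user_time_zero;
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_offset;
	uint64_t time_zero;
};

/*
 * Publish identity conversion parameters only when the TSC is stable.
 * Returns true if the parameters were published.
 */
static bool publish_hw_clock(struct userpage_sketch *pg,
			     bool tsc_stable, bool wants_hw_clock)
{
	/* Stability is checked first: never expose an unstable TSC. */
	if (!tsc_stable)
		return false;

	if (!wants_hw_clock)
		return false;

	pg->cap_user_time_zero = true;
	pg->time_mult = 1;
	pg->time_shift = 0;
	pg->time_offset = 0;
	pg->time_zero = 0;
	return true;
}

/* Cycle-to-time conversion as documented for cap_user_time_zero. */
static uint64_t cyc_to_time(const struct userpage_sketch *pg, uint64_t cyc)
{
	uint64_t quot = cyc >> pg->time_shift;
	uint64_t rem = cyc & (((uint64_t)1 << pg->time_shift) - 1);

	return pg->time_zero + quot * pg->time_mult +
	       ((rem * pg->time_mult) >> pg->time_shift);
}
```

With an unstable TSC publish_hw_clock() leaves the page untouched, so
user space never sees cap_user_time_zero set for a clock it cannot
trust.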

And I'm not at all sure about the whole virt thing. Last time I looked
at pvclock it made no sense at all.
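
For reference, the VMCS adjustment the changelog mentions is plain
arithmetic as long as the hypervisor leaves rdtsc alone; pvclock is a
separate matter and is not covered here. A minimal sketch, assuming the
VMX TSC multiplier with 48 fractional bits and the GCC/Clang
unsigned __int128 extension. The function names are hypothetical, mult
is assumed non-zero, and the offset is the signed VMCS value treated as
a wrapping 64-bit quantity.

```c
#include <stdint.h>

/* Assumption: VMX TSC multiplier uses 48 fractional bits. */
#define TSC_FRAC_BITS 48

/* What a guest rdtsc returns: (host * mult) >> 48, plus the offset. */
static uint64_t host_to_guest_tsc(uint64_t host_tsc, uint64_t mult,
				  uint64_t offset)
{
	unsigned __int128 prod = (unsigned __int128)host_tsc * mult;

	return (uint64_t)(prod >> TSC_FRAC_BITS) + offset;
}

/*
 * Inverse direction, for mapping guest timestamps onto the host
 * timeline. The result can be slightly low because the forward
 * conversion truncates.
 */
static uint64_t guest_to_host_tsc(uint64_t guest_tsc, uint64_t mult,
				  uint64_t offset)
{
	unsigned __int128 t =
		(unsigned __int128)(guest_tsc - offset) << TSC_FRAC_BITS;

	return (uint64_t)(t / mult);
}
```

A multiplier of 1ULL << 48 is a ratio of 1.0, in which case the guest
value differs from the host value by the offset only.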