[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ca309fcda71944455a8c6c1b308886ba@kernel.org>
Date: Thu, 17 Sep 2020 12:53:02 +0100
From: Marc Zyngier <maz@...nel.org>
To: Leo Yan <leo.yan@...aro.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Peter Zijlstra <peterz@...radead.org>,
Will Deacon <will@...nel.org>,
John Garry <john.garry@...wei.com>,
Mathieu Poirier <mathieu.poirier@...aro.org>,
Namhyung Kim <namhyung@...nel.org>,
Suleiman Souhlal <suleiman@...gle.com>,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCHv3] perf kvm: add kvm-stat for arm64
On 2020-09-17 12:42, Leo Yan wrote:
> On Thu, Sep 17, 2020 at 11:21:15AM +0100, Marc Zyngier wrote:
>
> [...]
>
>> > > > +const char *vcpu_id_str = "id";
>> > >
>> > > On Arm64, ftrace tracepoint "kvm_entry" doesn't contain the field "id"
>> > > or field "vcpu_id", thus it always reads out the "id" is 0 and it is
>> > > recorded into Perf's structure vcpu_event_record::vcpu_id and assigned
>> > > to perf thread's private data "thread::private".
>> > >
>> > > With current code, it will not mess up different vcpus' samples
>> > > because
>> > > now the samples are analyzed based on thread context, but since all
>> > > threads' "vcpu_id" is zero, thus all samples are accounted for
>> > > "vcpu_id=0" and cannot print out correct result with option "--vcpu":
>> > >
>> > >
>> > > $ perf kvm stat report --vcpu 4
>> > >
>> > > Analyze events for all VMs, VCPU 4:
>> > >
>> > > VM-EXIT Samples Samples% Time% Min Time
>> > > Max Time Avg time
>> > >
>> > > Total Samples:0, Total events handled time:0.00us.
>> > >
>> > >
>> > > This is an issue I observed, if we want to support option "--vcpu",
>> > > seems we need to change ftrace event for "kvm_entry", but this will
>> > > break ABI.
>> > >
>> > > Essentially, this issue is caused by different archs using different
>> > > format for ftrace event "kvm_entry", on x86 it contains feild
>> > > "vcpu_id" but arm64 only just records "vcpu_pc".
>> > >
>> > > @Marc, @Will, do you have any suggestion for this? Do you think it's
>> > > feasible to add a new field "vcpu_id" into the tracepoint "kvm_entry"
>> > > for Arm64's version?
>>
>> The question really is: how will you handle the ABI breackage?
>> I don't see a good solution for it, apart from having a *separate*
>> tracepoint that collects all the information you need. And even that
>> is
>> really ugly.
>
> I searched a bit and found in practice it's not impossible to add new
> parameters for existed tracepoint, e.g. [1][2] are two examples to add
> new parameters for existed tracepoints and have been merged into
> mainline kernel. IIUC, we keep the old parameters for a tracepoint
> so this can avoid to break ABI if any apps have used this tracepoint,
> and adding a new parameter for the tracepoint should be safe.
>
> If you agree with this, I'd like to suggest to apply below change.
> How about you think for this?
>
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 46dc3d75cf13..d9f9b8e1df77 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -736,7 +736,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> /**************************************************************
> * Enter the guest
> */
> - trace_kvm_entry(*vcpu_pc(vcpu));
> + trace_kvm_entry(vcpu->vcpu_id, *vcpu_pc(vcpu));
> guest_enter_irqoff();
>
> ret = kvm_call_hyp_ret(__kvm_vcpu_run, vcpu);
> diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h
> index 4691053c5ee4..e1d3e7a67e8b 100644
> --- a/arch/arm64/kvm/trace_arm.h
> +++ b/arch/arm64/kvm/trace_arm.h
> @@ -12,18 +12,20 @@
> * Tracepoints for entry/exit to guest
> */
> TRACE_EVENT(kvm_entry,
> - TP_PROTO(unsigned long vcpu_pc),
> - TP_ARGS(vcpu_pc),
> + TP_PROTO(unsigned int vcpu_id, unsigned long vcpu_pc),
> + TP_ARGS(vcpu_id, vcpu_pc),
>
> TP_STRUCT__entry(
> + __field( unsigned int, vcpu_id )
> __field( unsigned long, vcpu_pc )
> ),
>
> TP_fast_assign(
> + __entry->vcpu_id = vcpu_id;
> __entry->vcpu_pc = vcpu_pc;
> ),
>
> - TP_printk("PC: 0x%08lx", __entry->vcpu_pc)
> + TP_printk("vcpu: %u, PC: 0x%08lx", __entry->vcpu_id,
> __entry->vcpu_pc)
> );
>
> TRACE_EVENT(kvm_exit,
>
How is that not breaking the ABI? You are adding a new field, and
anything
that expect to read 'PC: 0x.....' at the beginning of the line now
fails.
The examples you give are also blatant ABI breakages. because it is done
somewhere else doesn't make it valid.
Anything that can be parsed by userspace is ABI. If you don't believe
me,
please read the entertaining discussion we had when we tried to drop
Bogomips from /proc/cpuinfo.
So unless you get me Linus' stamp of approval for this, it's not
happening.
Feel free to add a *new* tracepoint instead.
M.
--
Jazz is not dead. It just smells funny...
Powered by blists - more mailing lists