linux-kernel - Re: [PATCHv3] perf kvm: add kvm-stat for arm64

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200917125246.GF12548@leoy-ThinkPad-X240s>
Date:   Thu, 17 Sep 2020 20:52:46 +0800
From:   Leo Yan <leo.yan@...aro.org>
To:     Marc Zyngier <maz@...nel.org>
Cc:     Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will@...nel.org>,
        John Garry <john.garry@...wei.com>,
        Mathieu Poirier <mathieu.poirier@...aro.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Suleiman Souhlal <suleiman@...gle.com>,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCHv3] perf kvm: add kvm-stat for arm64

On Thu, Sep 17, 2020 at 12:53:02PM +0100, Marc Zyngier wrote:
> On 2020-09-17 12:42, Leo Yan wrote:
> > On Thu, Sep 17, 2020 at 11:21:15AM +0100, Marc Zyngier wrote:
> > 
> > [...]
> > 
> > > > > > +const char *vcpu_id_str = "id";
> > > > >
> > > > > On Arm64, ftrace tracepoint "kvm_entry" doesn't contain the field "id"
> > > > > or field "vcpu_id", thus it always reads out the "id" is 0 and it is
> > > > > recorded into Perf's structure vcpu_event_record::vcpu_id and assigned
> > > > > to perf thread's private data "thread::private".
> > > > >
> > > > > With current code, it will not mess up different vcpus' samples
> > > > > because
> > > > > now the samples are analyzed based on thread context, but since all
> > > > > threads' "vcpu_id" is zero, thus all samples are accounted for
> > > > > "vcpu_id=0" and cannot print out correct result with option "--vcpu":
> > > > >
> > > > >
> > > > >   $ perf kvm stat report --vcpu 4
> > > > >
> > > > >   Analyze events for all VMs, VCPU 4:
> > > > >
> > > > >              VM-EXIT    Samples  Samples%     Time%    Min Time
> > > > > Max Time         Avg time
> > > > >
> > > > >   Total Samples:0, Total events handled time:0.00us.
> > > > >
> > > > >
> > > > > This is an issue I observed, if we want to support option "--vcpu",
> > > > > seems we need to change ftrace event for "kvm_entry", but this will
> > > > > break ABI.
> > > > >
> > > > > Essentially, this issue is caused by different archs using different
> > > > > format for ftrace event "kvm_entry", on x86 it contains feild
> > > > > "vcpu_id" but arm64 only just records "vcpu_pc".
> > > > >
> > > > > @Marc, @Will, do you have any suggestion for this?  Do you think it's
> > > > > feasible to add a new field "vcpu_id" into the tracepoint "kvm_entry"
> > > > > for Arm64's version?
> > > 
> > > The question really is: how will you handle the ABI breackage?
> > > I don't see a good solution for it, apart from having a *separate*
> > > tracepoint that collects all the information you need. And even that
> > > is
> > > really ugly.
> > 
> > I searched a bit and found in practice it's not impossible to add new
> > parameters for existed tracepoint, e.g. [1][2] are two examples to add
> > new parameters for existed tracepoints and have been merged into
> > mainline kernel.  IIUC, we keep the old parameters for a tracepoint
> > so this can avoid to break ABI if any apps have used this tracepoint,
> > and adding a new parameter for the tracepoint should be safe.
> > 
> > If you agree with this, I'd like to suggest to apply below change.
> > How about you think for this?
> > 
> > 
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 46dc3d75cf13..d9f9b8e1df77 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -736,7 +736,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >  		/**************************************************************
> >  		 * Enter the guest
> >  		 */
> > -		trace_kvm_entry(*vcpu_pc(vcpu));
> > +		trace_kvm_entry(vcpu->vcpu_id, *vcpu_pc(vcpu));
> >  		guest_enter_irqoff();
> > 
> >  		ret = kvm_call_hyp_ret(__kvm_vcpu_run, vcpu);
> > diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h
> > index 4691053c5ee4..e1d3e7a67e8b 100644
> > --- a/arch/arm64/kvm/trace_arm.h
> > +++ b/arch/arm64/kvm/trace_arm.h
> > @@ -12,18 +12,20 @@
> >   * Tracepoints for entry/exit to guest
> >   */
> >  TRACE_EVENT(kvm_entry,
> > -	TP_PROTO(unsigned long vcpu_pc),
> > -	TP_ARGS(vcpu_pc),
> > +	TP_PROTO(unsigned int vcpu_id, unsigned long vcpu_pc),
> > +	TP_ARGS(vcpu_id, vcpu_pc),
> > 
> >  	TP_STRUCT__entry(
> > +		__field(	unsigned int,	vcpu_id		)
> >  		__field(	unsigned long,	vcpu_pc		)
> >  	),
> > 
> >  	TP_fast_assign(
> > +		__entry->vcpu_id		= vcpu_id;
> >  		__entry->vcpu_pc		= vcpu_pc;
> >  	),
> > 
> > -	TP_printk("PC: 0x%08lx", __entry->vcpu_pc)
> > +	TP_printk("vcpu: %u, PC: 0x%08lx", __entry->vcpu_id, __entry->vcpu_pc)
> >  );
> > 
> >  TRACE_EVENT(kvm_exit,
> > 
> 
> How is that not breaking the ABI? You are adding a new field, and anything
> that expect to read 'PC: 0x.....' at the beginning of the line now fails.
> The examples you give are also blatant ABI breakages. because it is done
> somewhere else doesn't make it valid.
> 
> Anything that can be parsed by userspace is ABI. If you don't believe me,
> please read the entertaining discussion we had when we tried to drop
> Bogomips from /proc/cpuinfo.

The discussion thread was too long [1] to read all replies :)

... but I understand we should be very careful for ABI breakage.

> So unless you get me Linus' stamp of approval for this, it's not happening.
> Feel free to add a *new* tracepoint instead.

Okay, thanks for the info and suggestions.

Thanks,
Leo

[1] https://lkml.org/lkml/2015/1/4/132