Message-ID: <SY4P282MB1084293EB22AB6FBF72DA0599DCCA@SY4P282MB1084.AUSP282.PROD.OUTLOOK.COM>
Date: Wed, 11 Oct 2023 22:44:55 +0800
From: Tianyi Liu <i.pear@...look.com>
To: mlevitsk@...hat.com
Cc: acme@...nel.org, adrian.hunter@...el.com,
alexander.shishkin@...ux.intel.com, i.pear@...look.com,
irogers@...gle.com, jolsa@...nel.org, kvm@...r.kernel.org,
kvmarm@...ts.linux.dev, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
mark.rutland@....com, mingo@...hat.com, namhyung@...nel.org,
pbonzini@...hat.com, peterz@...radead.org, seanjc@...gle.com,
x86@...nel.org
Subject: Re: [PATCH v2 4/5] perf kvm: Support sampling guest callchains
Hi Maxim,
At 2023-10-10 16:12 +0000, Maxim Levitsky wrote:
> > +static inline void
> > +perf_callchain_guest32(struct perf_callchain_entry_ctx *entry)
> > +{
> > + struct stack_frame_ia32 frame;
> > + const struct stack_frame_ia32 *fp;
> > +
> > + fp = (void *)perf_guest_get_frame_pointer();
> > + while (fp && entry->nr < entry->max_stack) {
> > + if (!perf_guest_read_virt(&fp->next_frame, &frame.next_frame,
> This should be fp->next_frame.
> > + sizeof(frame.next_frame)))
> > + break;
> > + if (!perf_guest_read_virt(&fp->return_address, &frame.return_address,
> Same here.
> > + sizeof(frame.return_address)))
> > + break;
> > + perf_callchain_store(entry, frame.return_address);
> > + fp = (void *)frame.next_frame;
> > + }
> > +}
> > +
`fp` here is an address in guest memory, not a pointer into the directly
accessible kernel address space, so it is never dereferenced.
`&fp->next_frame` and `&fp->return_address` only compute address offsets
in a more readable manner, much like byte offsets 0 and 4 from `fp`.
The original implementations of `perf_callchain_user` and
`perf_callchain_user32` also use this approach [1].
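To illustrate the point, here is a minimal userspace sketch, not the
patch itself. `guest_mem`, `guest_read_virt`, `guest_write32`,
`walk_guest32` and `demo` are hypothetical stand-ins made up for this
example; only the two-field layout of `struct stack_frame_ia32` is taken
from the code quoted above. The frame pointer is carried as a plain
32-bit guest address and each field is fetched by offset, which is what
`&fp->next_frame` and `&fp->return_address` amount to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two 32-bit fields, mirroring the frame layout used in the patch. */
struct stack_frame_ia32 {
	uint32_t next_frame;
	uint32_t return_address;
};

/* Taking a member's address only adds these offsets; nothing is read. */
static_assert(offsetof(struct stack_frame_ia32, next_frame) == 0,
	      "next_frame is at byte offset 0");
static_assert(offsetof(struct stack_frame_ia32, return_address) == 4,
	      "return_address is at byte offset 4");

/* Hypothetical guest memory: guest address N maps to guest_mem[N]. */
static uint8_t guest_mem[64];

/* Hypothetical stand-in for perf_guest_read_virt(): 1 on success, 0 on failure. */
static int guest_read_virt(uint32_t gva, void *dst, size_t len)
{
	if (gva > sizeof(guest_mem) || len > sizeof(guest_mem) - gva)
		return 0;
	memcpy(dst, guest_mem + gva, len);
	return 1;
}

/* Hypothetical helper to populate the fake guest stack. */
static void guest_write32(uint32_t gva, uint32_t val)
{
	memcpy(guest_mem + gva, &val, sizeof(val));
}

/* Walk the frame chain; fp is a guest address and is never dereferenced. */
static size_t walk_guest32(uint32_t fp, uint32_t *out, size_t max)
{
	struct stack_frame_ia32 frame;
	size_t n = 0;

	while (fp && n < max) {
		if (!guest_read_virt(fp + offsetof(struct stack_frame_ia32, next_frame),
				     &frame.next_frame, sizeof(frame.next_frame)))
			break;
		if (!guest_read_virt(fp + offsetof(struct stack_frame_ia32, return_address),
				     &frame.return_address, sizeof(frame.return_address)))
			break;
		out[n++] = frame.return_address;
		fp = frame.next_frame;
	}
	return n;
}

/* Build two frames (guest address 16 -> 32 -> end) and walk them. */
size_t demo(uint32_t *out, size_t max)
{
	guest_write32(16, 32);
	guest_write32(20, 0x1111);
	guest_write32(32, 0);
	guest_write32(36, 0x2222);
	return walk_guest32(16, out, max);
}
```

Calling `demo()` with a four-entry buffer walks both frames and stops
when `next_frame` reads back as 0, without the host ever treating `fp`
as one of its own pointers.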
>
> For symmetry, maybe it makes sense to have perf_callchain_guest32 and perf_callchain_guest64
> and then make perf_callchain_guest call each? No strong opinion on this of course.
>
The `perf_callchain_guest` and `perf_callchain_guest32` here are simply
designed to mimic `perf_callchain_user` and `perf_callchain_user32` [2].
I'm also open to making the logic fully separate if this doesn't seem
elegant enough.
[1] https://github.com/torvalds/linux/blob/master/arch/x86/events/core.c#L2890
[2] https://github.com/torvalds/linux/blob/master/arch/x86/events/core.c#L2820
Best regards,
Tianyi Liu