[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20211019152831.GB4938@lakrids.cambridge.arm.com>
Date: Tue, 19 Oct 2021 16:28:31 +0100
From: Mark Rutland <mark.rutland@....com>
To: Rob Herring <robh@...nel.org>
Cc: Will Deacon <will@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Catalin Marinas <catalin.marinas@....com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...hat.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Ian Rogers <irogers@...gle.com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Honnappa Nagarahalli <honnappa.nagarahalli@....com>,
Zachary.Leaf@....com, Raphael Gault <raphael.gault@....com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Namhyung Kim <namhyung@...nel.org>,
Itaru Kitayama <itaru.kitayama@...il.com>,
Vince Weaver <vincent.weaver@...ne.edu>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v10 4/5] arm64: perf: Enable PMU counter userspace access
for perf event
Hi Rob,
On Thu, Oct 14, 2021 at 02:24:46PM -0500, Rob Herring wrote:
> On Thu, Oct 14, 2021 at 11:58 AM Mark Rutland <mark.rutland@....com> wrote:
> > On Tue, Sep 14, 2021 at 03:47:59PM -0500, Rob Herring wrote:
> > For the `config1 = 3` case (potentially) overriding the usual long
> > semantic, I'm struggling to understand why we need that rather than
> > forcing the use of a 64-bit counter, because in that case:
> >
> > * For a CPU_CYCLES event:
> > __armv8_pmuv3_map_event() will always pick 64-bits
> > get_event_idx() may fail to allocate a 64-bit counter.
> >
> > * For other events:
> > __armv8_pmuv3_map_event() will pick 32/64 based on long counter
> > support
> > get_event_idx() will only fail if there are no counters free.
> >
> > Whereas if __armv8_pmuv3_map_event() returned an error for the latter
> > when long counter support is not implemented, we'd have consistent
> > `long` semantics, and the CPU_CYCLES behaviour would be identical.
> >
> > What's the rationale for `3` leaving the choice to the kernel?
>
> It's the give me the maximum sized counter the h/w can support choice.
> That's easier for userspace to implement. Bit 1 is more of a hint that
> the user wants userspace access rather than a requirement.
>
> > If the problem is discoverability, I'd be happy to add something to
> > sysfs to describe whether the PMU has long event support.
>
> Checking sysfs or a try for 64-bit support then fall back to 32-bit
> support isn't much difference.
>
> Keep in mind that x86 always succeeds here. Every userspace user will
> have to add whatever dance we create here. For example, each libperf
> test with user access (there's only 2 in my tree, but there's a series
> adding more) has to have an '#ifdef __aarch64__' for whatever we do
> here. I was seeking to minimize that. Right now, that's just a set
> config1 to 0x3. Also, note that libperf will opportunistically use a
> userspace read instead of read(). The user just has to mmap the event
> and libperf will use a userspace read when enabled which ultimately
> depends on what the mmapped page says.
I think that x86 always succeeding here is more of a legacy thing that
they're stuck with rather than a design to be copied.
I'd prefer to keep the existing meaning of the `long` flag to mean "give
me 64 bits of counter, somehow", with `rdpmc` meaning "give me a single
counter I can access from userspace", even if that means the combination
of the two can sometimes be rejected. As you say, we can probe for that
as necessary by trying `long` then falling back to a plain event, and if
that ends up being a bottleneck somehow we can figure out a way of
advertising support to userspace. Regardless, we should
Importantly, I don't think libperf should override a user's request for
`long`, since the user may want to optimize for minimal perturbation
rather than faster access.
If we want a "please give me the longest counter that's compatible with
other constraints", I think that should be a new flag e.g. `trylong`,
and shouldn't override the existing `long`. We can add that as a
follow-up if we want it.
Thanks,
Mark.
Powered by blists - more mailing lists