[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250226134813.GC1821331@e132581.arm.com>
Date: Wed, 26 Feb 2025 13:48:13 +0000
From: Leo Yan <leo.yan@....com>
To: Mark Rutland <mark.rutland@....com>
Cc: Rob Herring <robh@...nel.org>, Will Deacon <will@...nel.org>,
Catalin Marinas <catalin.marinas@....com>,
Jonathan Corbet <corbet@....net>, Marc Zyngier <maz@...nel.org>,
Oliver Upton <oliver.upton@...ux.dev>,
Joey Gouly <joey.gouly@....com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Zenghui Yu <yuzenghui@...wei.com>,
James Clark <james.clark@...aro.org>,
Anshuman Khandual <anshuman.khandual@....com>,
linux-arm-kernel@...ts.infradead.org,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, kvmarm@...ts.linux.dev
Subject: Re: [PATCH v20 11/11] perf: arm_pmuv3: Add support for the Branch
Record Buffer Extension (BRBE)
On Tue, Feb 25, 2025 at 07:58:18PM +0000, Mark Rutland wrote:
> On Tue, Feb 25, 2025 at 05:48:03PM +0000, Leo Yan wrote:
> > On Tue, Feb 25, 2025 at 12:01:52PM +0000, Mark Rutland wrote:
> >
> > [...]
> >
> > > > > Critically, the brbe_enable() function merges the filters of all
> > > > > *active* events which have been installed into hardware. It does not
> > > > > track all events which can be rotated, and the resulting filter is not
> > > > > the same -- it can change as a result of rotation.
> > > >
> > > > In a perf session has multiple events, and events have different branch
> > > > filters, seems to me, a simple way is to return error for this case.
> > >
> > > FWIW, I'd generally prefer to do that since it avoids a number of
> > > horrible edge-cases and gets rid of the need to do SW filtering, which
> > > falls somewhere between "tricky" and "not entirely possible". However,
> > > that's not what LBR and others do, which is why we went with filter
> > > merging.
> > >
> > > If folk on the tools side are happy with the kernel rejecting
> > > conflicting events, then I'd be more than happy to do that. What I don't
> > > want is that we start off with that approach and people immediately
> > > start to complain that the BRBE driver rejects events that the LBR
> > > driver accepts.
> > >
> > > See the last time this came up.
> >
> > Thanks for the shared links. Based on the info, let's say we can have two
> > cases:
> >
> > Case 1: set different branch filters in a single perf session:
> >
> > perf record -e armv8_pmuv3_0/r03,branch_type=any_call/u \
> > -e armv8_pmuv3_0/r04,branch_type=any_ret/k ...
> >
> > Case 2: set different branch filters in multiple perf sessions:
> >
> > perf record -e armv8_pmuv3_0/r03,branch_type=any_call/u ...
> >
> > perf record -e armv8_pmuv3_0/r04,branch_type=any_ret/k ...
> >
> > In my previous reply, I was suggesting that we should reject the case 1.
>
> Do you mean that the kernel should reject that, or that userspace should
> reject that.
For the case 1, the userspace can reject it as it has sufficient info to
detect mismatched filters by iterating event list.
> As mentioned earlier, I am ok with the idea that we reject *scheduling*
> events with mismatched filters, as we do for other resource conflicts.
> That would necessarily mean rejecting *groups* of events with
> inconsistent filters at open time.
This rejects the case 2 in the PMU driver.
> However, I do not think that we should reject indepenent events which
> happen to have mismatched filters, regardless of whether they're opened
> by the same "session".
Can I understand "independent events" are events that are not in a event
group?
If independent events can have mismatched filters, the PMU driver can
activate them simultaneously, this means the BRBE driver needs to merge
the branch filters. If so, I still see the complexity caused by this
case.
> > IMO, it is not quite useful to configure different filters for events in
> > the same session, especially if this leads complexity in the driver due
> > to the hardware limitation.
>
> I generally agree, but IIRC userspace does this today.
>
> > For case 2, when create a new session, if the perf tool can read out the
> > current branch filter setting (e.g. via sysfs node) and give suggestion
> > what branch filter is compabile with existed sessions, seems to me, this
> > is a feasible solution. My understanding this is a rare case, and a
> > clear guidance for users would be sufficient if this happens. (Maybe
> > we can give recommendation for how to use BRBE in the perf doc).
>
> No. We are not going to expose *dynamic* information about the PMU
> hardware via sysfs. This is not as simple as you make it out to be.
> At any point in time there can be an arbitrary number of events open,
> and some arbitrary subset currently scheduled.
If so, the perf tool will miss info for checking the existed branch
filter and defer to handle the mismatched filter error until the
kernel reports the issue.
> I agree that ideally this should be rare, though, and hence either of
> the two options I have suggested thus far should handle that acceptably.
>
> > To be clear, an important factor is the trace modes with modifier 'u'
> > (user) and 'k' (kernel) should be supported for different events and for
> > different sessions. In a mixed cases (some events are userspace only
> > and some are kernel only), the BRBE driver needs to filter out branch
> > records for specific mode when taking a sample.
>
> I hate to repeat myself, but the driver has *no concept whatsoever* of a
> "session". It only knows:
>
> * Which events are currently active in hardware.
>
> * Which events have been grouped together.
>
> * Which events have been opened for a given task or CPU context.
>
> ... and *NONE* of those correspond directly to a "session" managed by
> the userspace perf tool.
Sorry for that. After your reminder, I understood that the PMU driver
has no knowledge for perf session.
The description above does not refer to perf session. As you said, the
PMU driver has its own context to track active events, and it is
possible that some active events might trace only user mode while others
trace only kernel mode. In this case, BRBE needs to be enabled for both
user and kernel modes. When capturing samples, the BRBE driver needs to
filter out branch records based on a specific event for its
corresponding mode.
For better understanding, though we don't allow mismatched filters, I
still think we should set both E0BRE and ExBRE bits if there are
multiple active events with different modes.
[...]
> > I haven't done any experiments for this. If you think this is the way
> > to move forward, I might do a prototype and get back to you to ensure we
> > don't run into any unexpected issues.
>
> What specifically are you proposing to prototype?
I mean, in Linux perf, we can add code to check branch filters for
opened events when creating a session and report an error for
mismatching filters. I would suggest we implement it specifically
for Arm64 to avoid dispute with other archs.
Thanks,
Leo
Powered by blists - more mailing lists