lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z7yY19UtSnND5KTl@J2N7QTR9R3.cambridge.arm.com>
Date: Mon, 24 Feb 2025 16:05:43 +0000
From: Mark Rutland <mark.rutland@....com>
To: Leo Yan <leo.yan@....com>
Cc: Rob Herring <robh@...nel.org>, Will Deacon <will@...nel.org>,
	Catalin Marinas <catalin.marinas@....com>,
	Jonathan Corbet <corbet@....net>, Marc Zyngier <maz@...nel.org>,
	Oliver Upton <oliver.upton@...ux.dev>,
	Joey Gouly <joey.gouly@....com>,
	Suzuki K Poulose <suzuki.poulose@....com>,
	Zenghui Yu <yuzenghui@...wei.com>,
	James Clark <james.clark@...aro.org>,
	Anshuman Khandual <anshuman.khandual@....com>,
	linux-arm-kernel@...ts.infradead.org,
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-doc@...r.kernel.org, kvmarm@...ts.linux.dev
Subject: Re: [PATCH v20 11/11] perf: arm_pmuv3: Add support for the Branch
 Record Buffer Extension (BRBE)

On Mon, Feb 24, 2025 at 02:03:17PM +0000, Leo Yan wrote:
> On Mon, Feb 24, 2025 at 06:46:35AM -0600, Rob Herring wrote:
> > On Mon, Feb 24, 2025 at 6:25 AM Leo Yan <leo.yan@....com> wrote:
> > > On Tue, Feb 18, 2025 at 02:40:06PM -0600, Rob Herring (Arm) wrote:
> > > >
> > > > From: Anshuman Khandual <anshuman.khandual@....com>
> > >
> > > [...]
> > >
> > > > BRBE records are invalidated whenever events are reconfigured, a new
> > > > task is scheduled in, or after recording is paused (and the records
> > > > have been recorded for the event). The architecture allows branch
> > > > records to be invalidated by the PE under implementation defined
> > > > conditions. It is expected that these conditions are rare.
> > >
> > > [...]
> > >
> > > > +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
> > > > +{
> > > > +       struct arm_pmu *armpmu = *this_cpu_ptr(&cpu_armpmu);
> > > > +       struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
> > > > +
> > > > +       if (!hw_events->branch_users)
> > > > +               return;
> > > > +
> > > > +       if (sched_in)
> > > > +               brbe_invalidate();
> > > > +}
> > >
> > > Just a minor concern.  I don't see any handling for task migration.
> > > E.g., for a task is migrated from one CPU to another CPU, I expect we
> > > need to save and restore branch records based on BRBE injection.  So
> > > far, the driver simply invalidates all records.
> > >
> > > I think this topic is very likely discussed before.  If this is the
> > > case, please ignore my comment.  Except this, the code looks good
> > > to me.
> > 
> > Not really discussed on the list, but that was present in v18 (though
> > not functional because .sched_task() hook wasn't actually enabled) and
> > Mark removed it. His work is here[1].The only comment was:
> > 
> > Note: saving/restoring at context-switch doesn't interact well with
> > event rotation (e.g. if filters change)
> 
> In the brbe_enable() function, it "Merge the permitted branch filters
> of all events".  Based on current implementation, all events share the
> same branch filter.

Critically, the brbe_enable() function merges the filters of all
*active* events which have been installed into hardware. It does not
track all events which can be rotated, and the resulting filter is not
the same -- it can change as a result of rotation.

> When event rotation happens, if without context switch, in theory we
> should can directly use the branch record (no invalidation, no injection)
> for all events.

No; that only works in *some* cases, and will produce incorrect results
in others.

For example, consider filtering. Imagine a PMU with a single counter,
and two events, where event-A filters for calls-and-returns and event-B
filters for calls-only. When switching from event-A to event-B, it's
theoretically possible to keep the existing records around, knowing that
the returns can be filtered out later. When switching from event-B to
event-A we cannot keep the existing records, since there are gaps
whenever a return should have been recorded.

There are a number of cases of that shape given the set of configurable
filters. In theory it's possible to retain those in some cases, but I
don't think that the complexity is justified.

Similarly, whenever kernel branches are recorded it's necessary to drop
the stale branches whenever branch recording is paused, as there's
necessarily a blackout period and hence a gap in the records.

Do you think that you have a case where losing branches across rotation
*really* matters?

> For a context-switch case, we need to save and re-inject branch record.
> BRBE record sticks to a process context, no matter what events have been
> enabled.

I had originally wanted to keep per-event records around, but it doesn't
work in all cases. One reason events get discarded at context-switch
time is that CPU-bound events can sample branches, and would
mis-attribute stale userspace branches to the wrong context when
switching tasks. There are explicit comments about this in
amd_pmu_brs_sched_task() and intel_pmu_lbr_sched_task().

Given we discard records when reprogramming events, we *could* try to
preserve events in some cases, but I suspect that as with the rotation
case this'll be a lot of complexity for little gain. Note that as we
discard events when enabling the PMU, we'd throw some task-bound records
away anyway, and practically the gain would be limited to cpu-bound
records.

Do you have a reason why you think we *must* keep events around?

Mark.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ