[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <62ff8c83-e147-451f-8d08-44a6512e0f2e@linaro.org>
Date: Wed, 5 Feb 2025 14:51:17 +0000
From: James Clark <james.clark@...aro.org>
To: Rob Herring <robh@...nel.org>
Cc: linux-arm-kernel@...ts.infradead.org, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
kvmarm@...ts.linux.dev, Will Deacon <will@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Catalin Marinas <catalin.marinas@....com>, Jonathan Corbet <corbet@....net>,
Marc Zyngier <maz@...nel.org>, Oliver Upton <oliver.upton@...ux.dev>,
Joey Gouly <joey.gouly@....com>, Suzuki K Poulose <suzuki.poulose@....com>,
Zenghui Yu <yuzenghui@...wei.com>,
Anshuman Khandual <anshuman.khandual@....com>
Subject: Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch
Record Buffer Extension (BRBE)
On 05/02/2025 2:38 pm, James Clark wrote:
>
>
> On 04/02/2025 3:03 pm, Rob Herring wrote:
>> On Tue, Feb 4, 2025 at 6:03 AM James Clark <james.clark@...aro.org>
>> wrote:
>>>
>>>
>>>
>>> On 03/02/2025 5:58 pm, Rob Herring wrote:
>>>> On Mon, Feb 3, 2025 at 10:53 AM James Clark <james.clark@...aro.org>
>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 03/02/2025 12:43 am, Rob Herring (Arm) wrote:
>>>>>> From: Anshuman Khandual <anshuman.khandual@....com>
>>>>>>
>>>>>> The ARMv9.2 architecture introduces the optional Branch Record Buffer
>>>>>> Extension (BRBE), which records information about branches as they
>>>>>> are
>>>>>> executed into set of branch record registers. BRBE is similar to
>>>>>> x86's
>>>>>> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
>>>>>> (BHRB).
>>
>> [...]
>>
>>>>>> + /*
>>>>>> + * Require that the event filter and branch filter
>>>>>> permissions match.
>>>>>> + *
>>>>>> + * The event and branch permissions can only mismatch if the
>>>>>> user set
>>>>>> + * at least one of the privilege branch filters in
>>>>>> PERF_SAMPLE_BRANCH_PLM_ALL.
>>>>>> + * Otherwise, the core will set the branch sample
>>>>>> permissions in
>>>>>> + * perf_copy_attr().
>>>>>> + */
>>>>>> + if ((event->attr.exclude_user != !(branch_type &
>>>>>> PERF_SAMPLE_BRANCH_USER)) ||
>>>>>> + (event->attr.exclude_kernel != !(branch_type &
>>>>>> PERF_SAMPLE_BRANCH_KERNEL)) ||
>>>>>
>>>>> I don't think this one is right. By default perf_copy_attr() copies
>>>>> the
>>>>> exclude_ settings into the branch settings so this works, but if the
>>>>> user sets any _less_ permissive branch setting this fails. For
>>>>> example:
>>>>>
>>>>> # perf record -j any,u -- true
>>>>> Error:
>>>>> cycles:PH: PMU Hardware or event type doesn't support branch stack
>>>>> sampling.
>>>>>
>>>>> Here we want the default sampling permissions (exclude_kernel == 0,
>>>>> exclude_user == 0), but only user branch records, which doesn't match.
>>>>> It should be allowed because it doesn't include anything that we're
>>>>> not
>>>>> allowed to see.
>>>>
>>>> I know it is allowed(on x86), but why would we want that? If you do
>>>> something even more restricted:
>>>>
>>>> perf record -e cycles:k -j any,u -- true
>>>>
>>>> That's allowed on x86 and gives you samples with user addresses. But
>>>> all the events happened in the kernel. How does that make any sense?
>>>>
>>>> I suppose in your example, we could avoid attaching branch stack on
>>>> samples from the kernel. However, given how my example works, I'm
>>>> pretty sure that's not what x86 does.
>>>>
>>>> There's also combinations that are allowed, but record no samples.
>>>> Though I think that was with guest events. I've gone with reject
>>>> non-sense combinations as much as possible. We can easily remove those
>>>> restrictions later if needed. Changing the behavior later (for the
>>>> same configuration) wouldn't be good.
>>>>
>>>>
>>>
>>> Rejecting ones that produce no samples is fair enough, but my example
>>> does produce samples. To answer the question "why would we want that?",
>>> nothing major, but there are a few small reasons:
>>>
>>> * Perf includes both user and kernel by default, so the shortest
>>> command to only gather user branches doesn't work (-j any,u)
>>> * The test already checks for branch stack support like this, so old
>>> Perf test versions don't work
>>
>> I would be more concerned about this one except that *we* wrote that
>> test. (I'm not sure why we wrote a new test rather than adapting
>> record_lbr.sh...)
>>
>
> record_lbr.sh was added 6 months ago, test_brstack.sh 3 years ago so
> it's the other way around.
>
> Although record_lbr.sh also tests --call-graph and --stitch-lbr as well,
> so I think it's fine for test_brstack.sh to test only --branch-filter
> options at the lowest level.
>
> Looking at that test though I see there is a capability "/sys/devices/
> cpu/caps/branches". I'm wondering whether we should be adding that on
> the Arm PMU for BRBE?
>
> Ignoring the tests, the man pages (and some pages on the internet) give
> this example: "--branch-filter any_ret,u,k". This doesn't work either
> because it doesn't match the default exclude_hv option. It just seems a
> bit awkward and incompatible to me, for not much gain.
>
Looking at record_lbr.sh led me to the fact that --call-graph=lbr sets
"PERF_SAMPLE_BRANCH_USER" with the default kernel/user sampling mode,
causing the same issue.
>>> * You might only be optimising userspace, but still interested in the
>>> proportion of time spent or particular place in the kernel
>>
>> How do you see that? It looks completely misleading to me. 'perf
>> report' seems to only list branch stack addresses in this case. There
>> doesn't seem to be any matching of the event address to branch stack
>> addresses.
>>
>
> Perf script will show everything with all it's various options, or --
> branch-history on perf report will show both too. Also there are tools
> other than Perf, AutoFDO seems like something that BRBE can be used with.
>
>>> * Consistency with existing implementations and for people porting
>>> existing tools to Arm
>>> * It doesn't cost anything to support it (I think we just
>>> only check if exclude_* is set rather than !=)
>>> * Permissions checks should be handled by the core code so that
>>> they're consistent
>>> * What's the point of separate branch filters anyway if they always
>>> have to match the event filter?
>>
>> IDK, I wish someone could tell me. I don't see the usecase for them
>> being mismatched.
>>
>> In any case, I don't care too much one way or the other what we do
>> here. If everyone thinks we should relax this, then that's fine with
>> me.
>>
>
> Seeing the branch history from userspace that led up to a certain thing
> in the kernel happening doesn't seem like that much of an edge case to
> me. If you always have to have both on then you lose the userspace
> branch history because the buffer isn't that big and gets overwritten.
>
>>> Some of these things could be fixed in Perf, but not in older versions.
>>> Even if we can't think of a real use case now, it doesn't sound like the
>>> driver should be so restrictive of an option that doesn't do any harm.
>>>
>>>>> This also makes the Perf branch test skip because it uses
>>>>> any,save_type,u to see if BRBE exists.
>>>>
>>>> Yes, I plan to update that if we keep this behavior.
>>>>
>>>>>> + (!is_kernel_in_hyp_mode() &&
>>>>>> + (event->attr.exclude_hv != !(branch_type &
>>>>>> PERF_SAMPLE_BRANCH_HV))))
>>>>>> + return false;
>>>>>> +
>>>>>> + event->hw.branch_reg.config = branch_type_to_brbfcr(event-
>>>>>> >attr.branch_sample_type);
>>>>>> + event->hw.extra_reg.config = branch_type_to_brbcr(event-
>>>>>> >attr.branch_sample_type);
>>>>>> +
>>>>>> + return true;
>>>>>> +}
>>>>>> +
>>>>> [...]
>>>>>> +static const int
>>>>>> brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
>>>>>> + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
>>>>>
>>>>> Does the second field go into 'new_type'? They all seem to be zero so
>>>>> I'm not sure why new_type isn't ignored instead of having it mapped.
>>>>
>>>> Well, left over from when all the Arm specific types were supported.
>>>> So yeah, that can be simplified.
>>>>
>>>>>> + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
>>>>>
>>>>> How do ones that don't map to anything appear in Perf? For example
>>>>> BRBINFx_EL1_TYPE_TRAP is missing, and the test that was attached to
>>>>> the
>>>>> previous versions fails because it doesn't see the trap that jumps to
>>>>> the kernel, but it does still see the ERET back to userspace:
>>>>>
>>>>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>>>>>
>>>>> In older versions we'd also have BRBINFx_EL1_TYPE_TRAP mapping to
>>>>> PERF_BR_SYSCALL so you could see it go into the kernel before the
>>>>> return:
>>>>>
>>>>> trap_bench+0x1C/[unknown]/-/-/-/0/SYSCALL/-
>>>>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>>>>
>>>> My read of that was we should see a CALL in this case. Whether SVC
>>>> generates a TRAP or CALL depends on HFGITR_EL2.SVC_EL0 (table D18-2).
>>>> I assumed "SVC due to HFGITR_EL2.SVC_EL0" means when SVC_EL0 is set
>>>> (and set has additional conditions). We have SVC_EL0 cleared, so that
>>>> should be a CALL. Maybe the FVP has this wrong?
>>>>
>>>
>>> The test is doing this rather than a syscall:
>>>
>>> asm("mrs %0, ID_AA64ISAR0_EL1" : "=r" (val)); /* TRAP + ERET */
>>>
>>> So I think trap is right. Whether that should be mapped to SYSCALL or
>>> some other branch type I don't know, but the point is that it's
>>> missing now.
>>
>> We aren't supporting any of the Arm specific traps/exceptions. One
>> reason is for consistency with x86 like you just argued for. The only
>
> Does x86 leave holes in the program flow though, or is it complete? IMO
> it makes it harder for tools to make sense of the branch buffer if there
> are things like an ERET with no previous trap to match it up to.
>
>> exception types supported are syscall and IRQ. Part of the issue is
>> there is no userspace control over enabling all the extra Arm ones.
>> There's no way to say enable all branches except debug, fault, etc.
>> exceptions. If we want to support these, I think there should be user
>> control over enabling them. But that can come later if there's any
>> demand for them.
>>
>> Rob
>
> In this patchset we enable PERF_BR_IRQ with PERF_SAMPLE_BRANCH_ANY,
> without any way to selectively disable it. I would assume trap could be
> done with the same option.
>
> If we're filtering some of them out it might be worth documenting that
> "PERF_SAMPLE_BRANCH_ANY" doesn't actually mean 'any' branch type on Arm,
> and some types are recorded but discarded out before sending to userspace.
>
> There could be some confusion when there are partially filled or empty
> branch buffers, and the reason wouldn't be that there weren't any
> branches recorded, but they were all filtered out even with the 'any'
> option.
>
Powered by blists - more mailing lists