[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a859c5a1-538a-3dc8-b66e-a767de145501@arm.com>
Date: Mon, 21 May 2018 16:29:58 +0100
From: Robin Murphy <robin.murphy@....com>
To: Suzuki K Poulose <Suzuki.Poulose@....com>,
linux-arm-kernel@...ts.infradead.org
Cc: linux-kernel@...r.kernel.org, mark.rutland@....com,
will.deacon@....com
Subject: Re: [PATCH 6/6] arm64: perf: Add support for chaining counters
On 21/05/18 15:41, Suzuki K Poulose wrote:
> On 21/05/18 15:00, Robin Murphy wrote:
>> On 21/05/18 14:42, Suzuki K Poulose wrote:
>>> On 18/05/18 16:57, Suzuki K Poulose wrote:
>>>> Hi Robin,
>>>>
>>>> On 18/05/18 14:49, Robin Murphy wrote:
>>>>> On 18/05/18 11:22, Suzuki K Poulose wrote:
>>>>>> Add support for chained event counters. PMUv3 allows chaining
>>>>>> a pair of adjacent PMU counters (with the lower counter number
>>>>>> being always "even"). The low counter is programmed to count
>>>>>> the event of interest and the high counter(odd numbered) is
>>>>>> programmed with a special event code (0x1e - Chain). Thus
>>>>>> we need special allocation schemes to make the full use of
>>>>>> available counters. So, we allocate the counters from either
>>>>>> ends. i.e, chained counters are allocated from the lower
>>>>>> end in pairs of two and the normal counters are allocated
>>>>>> from the higher number. Also makes necessary changes to
>>>>>> handle the chained events as a single event with 2 counters.
>>>>>>
>>>>>> Cc: Mark Rutland <mark.rutland@....com>
>>>>>> Cc: Will Deacon <will.deacon@....com>
>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@....com>
>>>
>>>>
>>>>>> /*
>>>>>> @@ -845,8 +1016,14 @@ static int __armv8_pmuv3_map_event(struct
>>>>>> perf_event *event,
>>>>>> &armv8_pmuv3_perf_cache_map,
>>>>>> ARMV8_PMU_EVTYPE_EVENT);
>>>>>> - if (hw_event_id == ARMV8_PMUV3_PERFCTR_CPU_CYCLES)
>>>>>> + if (hw_event_id == ARMV8_PMUV3_PERFCTR_CPU_CYCLES) {
>>>>>> + /* Prevent chaining for cycle counter */
>>>>>
>>>>> Why? Sure, we want to avoid executing the chaining logic if we're
>>>>> scheduling a cycles event in the dedicated counter (which is
>>>>> perhaps what the comment above wanted to say), but if one ends up
>>>>> allocated into a regular counter (e.g. if the user asks for
>>>>> multiple cycle counts with different filters), then I don't see any
>>>>> reason to forbid that being chained.
>>>>
>>>> Ah, I didn't think about that case. I was under the assumption that the
>>>> cycles are *only* placed on the cycle counter. I will take care of
>>>> that.
>>>> Thanks for the review.
>>>
>>> Robin, Mark, Will
>>>
>>> One potential problem I see with allowing chaining of the cycle counter
>>> *and* the promotion of cycle event to 64bit by default is when the user
>>> may actually be able to count 1 less event (due to the promotion of
>>> cycle event to 64bit and thus forcing to use chain, if the cycle counter
>>> is unavailable).
>>
>> Right, I didn't mean to imply we should inject the "chain" attr
>> automatically for all cycles events, just that we shouldn't be
>> rejecting it if the user does explicitly set it (but then just ignore
>> it if using the dedicated counter).
>
> Right, I was not talking about the automatic "chain" for cycles events. The
> problem is we don't know if we would get the "cycle" counter for the given
> event until it is "added", at which point we have already decided
> whether the event is 32bit or 64bit. So, we cannot really delay the
> decision
> until that. Thats where this comes up. Given a cycle event (without an
> explicit
> chain request), do we treat it as a 64bit event or not ? If we do, we could
>
> 1) get the Cycle counter, all is fine.
>
> 2) If not, fallback to Chaining. The user looses a counter.
Ah, I think I see where our wires might be getting crossed here - taking
a second look at patch #4 I see you're inherently associating
ARMPMU_EVT_LONG with the ARMV8_PMUV3_PERFCTR_CPU_CYCLES event ID. What
I'm thinking of is that we would only set that flag if and when we've
allocated the cycle counter or a chained pair of regular counters (i.e.
in get_event_idx() or later). In other words, it becomes a property of
the counter(s) backing the event, rather than of the hardware event
itself, which I think makes logical sense.
>>> So one option is to drop automatic promotion of the cycle counter to
>>> 64bit and do it only when it is requested by the user and use either the
>>> Cycle counter (preferred) or fall back to chaining. That way, the user
>>> has the control over the number of events he can count using the given
>>> set of counters.
>>
>> Naively, there doesn't seem to be any inherent harm in always using
>> 64-bit arithmetic for the dedicated counter, but it would mean that
>> with multiple (non-chained) cycles events, some would be taking an
>> interrupt every few seconds while one would effectively never
>> overflow. I guess the question is whether that matters or not.
>>
>
> The problem is we can't have a mask per counter as we don't know where
> the event would be placed until it is added. Or we should delay the
> period initialisation/update to post get_event_idx().
...but does indeed mean that we can't initialise period stuff until we
know where the event has been placed. I reckon that is a reasonable
thing to do, but let's see what Mark and Will think.
(I guess there's also an ugly compromise in which we could re-run the
sample_period setup logic as a special case when allocating the cycle
counter, but even I don't like the idea of that)
Robin.
Powered by blists - more mailing lists