[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fc89da87-6070-beda-2437-4089257c55aa@amd.com>
Date: Fri, 2 Oct 2020 16:10:42 -0500
From: Kim Phillips <kim.phillips@....com>
To: "Liang, Kan" <kan.liang@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...hat.com, linux-kernel@...r.kernel.org, ak@...ux.intel.com
Subject: Re: [PATCH] perf/x86/intel: Fix n_metric for the canceled group
On 10/2/20 8:16 AM, Liang, Kan wrote:
> On 10/2/2020 7:02 AM, Peter Zijlstra wrote:
>> On Wed, Sep 30, 2020 at 07:29:35AM -0700, kan.liang@...ux.intel.com wrote:
>>> From: Kan Liang <kan.liang@...ux.intel.com>
>>>
>>> When a group that has TopDown members is failed to be scheduled, any
>>> later TopDown groups will not return valid values.
>>>
>>> Here is an example.
>>>
>>> A background perf that occupies all the GP counters and the fixed
>>> counter 1.
>>> $perf stat -e "{cycles,cycles,cycles,cycles,cycles,cycles,cycles,
>>> cycles,cycles}:D" -a
>>>
>>> A user monitors a TopDown group. It works well, because the fixed
>>> counter 3 and the PERF_METRICS are available.
>>> $perf stat -x, --topdown -- ./workload
>>> retiring,bad speculation,frontend bound,backend bound,
>>> 18.0,16.1,40.4,25.5,
>>>
>>> Then the user tries to monitor a group that has TopDown members.
>>> Because of the cycles event, the group is failed to be scheduled.
>>> $perf stat -x, -e '{slots,topdown-retiring,topdown-be-bound,
>>> topdown-fe-bound,topdown-bad-spec,cycles}'
>>> -- ./workload
>>> <not counted>,,slots,0,0.00,,
>>> <not counted>,,topdown-retiring,0,0.00,,
>>> <not counted>,,topdown-be-bound,0,0.00,,
>>> <not counted>,,topdown-fe-bound,0,0.00,,
>>> <not counted>,,topdown-bad-spec,0,0.00,,
>>> <not counted>,,cycles,0,0.00,,
>>>
>>> The user tries to monitor a TopDown group again. It doesn't work anymore.
>>> $perf stat -x, --topdown -- ./workload
>>>
>>> ,,,,,
>>>
>>> In a txn, cancel_txn() is to truncate the event_list for a canceled
>>> group and update the number of events added in this transaction.
>>> However, the number of TopDown events added in this transaction is not
>>> updated. The kernel will probably fail to add new Topdown events.
>>>
>>> Check if the canceled group has Topdown events. If so, subtract the
>>> TopDown events from n_metric accordingly.
>>>
>>> Fixes: 7b2c05a15d29 ("perf/x86/intel: Generic support for hardware TopDown metrics")
>>> Reported-by: Andi Kleen <ak@...ux.intel.com>
>>> Reviewed-by: Andi Kleen <ak@...ux.intel.com>
>>> Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
>>> ---
>>
>> Urgh, I'd much rather we add n_txn_metric. But also, while looking at
>> this, don't we have the same problem with n_pair ?
>>
>> Something like this perhaps...
>>
>
> Sure. For the perf metric, the below patch fixes the problem.
>
> Tested-by: Kan Liang <kan.liang@...ux.intel.com>
Tested-by: Kim Phillips <kim.phillips@....com>
Excerpt from test script:
sudo perf stat -e "{cycles,cycles,cycles,cycles}:D" -a sleep 10 &
# should succeed:
sudo perf stat -e "{fp_ret_sse_avx_ops.all}:D" -a workload
# should fail:
sudo perf stat -e "{fp_ret_sse_avx_ops.all,fp_ret_sse_avx_ops.all,cycles}:D" -a workload
# previously failed, now succeeds with this patch:
sudo perf stat -e "{fp_ret_sse_avx_ops.all}:D" -a workload
Thanks both,
Kim
Powered by blists - more mailing lists