Message-ID: <00b8e412-6756-630a-c0d2-4be7ad8948d4@linux.intel.com>
Date: Mon, 8 Feb 2021 08:50:14 -0500
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
Ingo Molnar <mingo@...nel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Stephane Eranian <eranian@...gle.com>,
Jiri Olsa <jolsa@...hat.com>, Andi Kleen <ak@...ux.intel.com>,
Yao Jin <yao.jin@...ux.intel.com>, maddy@...ux.vnet.ibm.com
Subject: Re: [PATCH 6/9] perf report: Support instruction latency
On 2/6/2021 3:09 AM, Namhyung Kim wrote:
> On Fri, Feb 5, 2021 at 11:38 PM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>
>> On 2/5/2021 6:08 AM, Namhyung Kim wrote:
>>> On Wed, Feb 3, 2021 at 5:14 AM <kan.liang@...ux.intel.com> wrote:
>>>>
>>>> From: Kan Liang <kan.liang@...ux.intel.com>
>>>>
>>>> The instruction latency information can be recorded on some platforms,
>>>> e.g., the Intel Sapphire Rapids server. With both memory latency
>>>> (weight) and the new instruction latency information, users can easily
>>>> locate the expensive load instructions, and also understand the time
>>>> spent in different stages. The users can optimize their applications
>>>> in different pipeline stages.
>>>>
>>>> The 'weight' field is shared among different architectures. Reusing the
>>>> 'weight' field may impact other architectures. Add a new field to store
>>>> the instruction latency.
>>>>
>>>> Like the 'weight' support, introduce a 'ins_lat' for the global
>>>> instruction latency, and a 'local_ins_lat' for the local instruction
>>>> latency version.
>>>
>>> Could you please clarify the difference between the global latency
>>> and the local latency?
>>>
>>
>> The global means the total latency.
>> The local means average latency, aka total / number of samples.
>
> Thanks for the explanation, but I think it's confusing.
> Why not call it just total_latency and avg_latency?
>
The instruction latency field is an extension of the weight field, so I
named it the same way. I still think we should keep the naming
consistent.
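As a rough illustration of the two terms (a hypothetical Python sketch, not
perf's actual code; the HistEntry class and its field names are made up for
this example, and the integer division is an assumption), 'global' is the sum
over an entry's samples and 'local' is that sum divided by the sample count:

```python
from dataclasses import dataclass


@dataclass
class HistEntry:
    """Hypothetical stand-in for one report row (not perf's real struct)."""

    total_ins_lat: int = 0  # running sum of instruction latency, in core cycles
    nr_samples: int = 0     # number of samples folded into this row

    def add_sample(self, ins_lat: int) -> None:
        # Each sample contributes its latency to the running total.
        self.total_ins_lat += ins_lat
        self.nr_samples += 1

    @property
    def global_ins_lat(self) -> int:
        # 'global' = total latency summed over all samples of the entry
        return self.total_ins_lat

    @property
    def local_ins_lat(self) -> int:
        # 'local' = average latency per sample (integer division assumed);
        # an entry with no samples reports 0 to avoid dividing by zero
        if self.nr_samples == 0:
            return 0
        return self.total_ins_lat // self.nr_samples


entry = HistEntry()
for cycles in (12, 30, 18):
    entry.add_sample(cycles)

print(entry.global_ins_lat, entry.local_ins_lat)  # prints "60 20"
```

The same relation applies to weight and local_weight.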
To address the confusion, I think we can update the documentation for both
the weight and the instruction latency fields.
How about the patch below?
From d5e80f541cb7288b24a7c5661ae5faede4747807 Mon Sep 17 00:00:00 2001
From: Kan Liang <kan.liang@...ux.intel.com>
Date: Mon, 8 Feb 2021 05:27:03 -0800
Subject: [PATCH] perf documentation: Add comments to the local/global
weight related fields
The current 'local' and 'global' prefixes are confusing for the
weight-related fields, e.g., weight and instruction latency.
Add comments to clarify:
'global' means the total weight/instruction latency sum.
'local' means the average weight/instruction latency per sample.
Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
---
tools/perf/Documentation/perf-report.txt | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index f546b5e..acc1c1d 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -92,8 +92,9 @@ OPTIONS
 	- srcfile: file name of the source file of the samples. Requires dwarf
 	information.
 	- weight: Event specific weight, e.g. memory latency or transaction
-	abort cost. This is the global weight.
-	- local_weight: Local weight version of the weight above.
+	abort cost. This is the global weight (total weight sum).
+	- local_weight: Local weight (average weight per sample) version of the
+	weight above.
 	- cgroup_id: ID derived from cgroup namespace device and inode numbers.
 	- cgroup: cgroup pathname in the cgroupfs.
 	- transaction: Transaction abort flags.
@@ -110,8 +111,9 @@ OPTIONS
 	--time-quantum (default 100ms). Specify with overhead and before it.
 	- code_page_size: the code page size of sampled code address (ip)
 	- ins_lat: Instruction latency in core cycles. This is the global instruction
-	latency
-	- local_ins_lat: Local instruction latency version
+	latency (total instruction latency sum)
+	- local_ins_lat: Local instruction latency (average instruction latency per
+	sample) version
 
 	By default, comm, dso and symbol keys are used.
 	(i.e. --sort comm,dso,symbol)
--
2.7.4
Thanks,
Kan