linux-kernel - Re: [PATCH 6/9] perf report: Support instruction latency


Message-ID: <00b8e412-6756-630a-c0d2-4be7ad8948d4@linux.intel.com>
Date:   Mon, 8 Feb 2021 08:50:14 -0500
From:   "Liang, Kan" <kan.liang@...ux.intel.com>
To:     Namhyung Kim <namhyung@...nel.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Stephane Eranian <eranian@...gle.com>,
        Jiri Olsa <jolsa@...hat.com>, Andi Kleen <ak@...ux.intel.com>,
        Yao Jin <yao.jin@...ux.intel.com>, maddy@...ux.vnet.ibm.com
Subject: Re: [PATCH 6/9] perf report: Support instruction latency



On 2/6/2021 3:09 AM, Namhyung Kim wrote:
> On Fri, Feb 5, 2021 at 11:38 PM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>
>> On 2/5/2021 6:08 AM, Namhyung Kim wrote:
>>> On Wed, Feb 3, 2021 at 5:14 AM <kan.liang@...ux.intel.com> wrote:
>>>>
>>>> From: Kan Liang <kan.liang@...ux.intel.com>
>>>>
>>>> The instruction latency information can be recorded on some platforms,
>>>> e.g., the Intel Sapphire Rapids server. With both memory latency
>>>> (weight) and the new instruction latency information, users can easily
>>>> locate the expensive load instructions, and also understand the time
>>>> spent in different stages. The users can optimize their applications
>>>> in different pipeline stages.
>>>>
>>>> The 'weight' field is shared among different architectures. Reusing the
>>>> 'weight' field may impacts other architectures. Add a new field to store
>>>> the instruction latency.
>>>>
>>>> Like the 'weight' support, introduce a 'ins_lat' for the global
>>>> instruction latency, and a 'local_ins_lat' for the local instruction
>>>> latency version.
>>>
>>> Could you please clarify the difference between the global latency
>>> and the local latency?
>>>
>>
>> The global means the total latency.
>> The local means average latency, aka total / number of samples.
> 
> Thanks for the explanation, but I think it's confusing.
> Why not call it just total_latency and avg_latency?
> 

The instruction latency field is an extension of the weight field, so I
named it the same way. I still think we should keep the naming
consistent.

To address the confusion, I think we could update the documentation for
both the weight and the instruction latency fields.

How about the patch below?

 From d5e80f541cb7288b24a7c5661ae5faede4747807 Mon Sep 17 00:00:00 2001
From: Kan Liang <kan.liang@...ux.intel.com>
Date: Mon, 8 Feb 2021 05:27:03 -0800
Subject: [PATCH] perf documentation: Add comments to the local/global weight related fields

The current 'local' and 'global' prefixes are confusing for the weight
related fields, e.g., weight and instruction latency.

Add comments to clarify.
'global' means the total weight/instruction latency sum.
'local' means the average weight/instruction latency per sample.

Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
---
  tools/perf/Documentation/perf-report.txt | 10 ++++++----
  1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index f546b5e..acc1c1d 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -92,8 +92,9 @@ OPTIONS
  	- srcfile: file name of the source file of the samples. Requires dwarf
  	information.
  	- weight: Event specific weight, e.g. memory latency or transaction
-	abort cost. This is the global weight.
-	- local_weight: Local weight version of the weight above.
+	abort cost. This is the global weight (total weight sum).
+	- local_weight: Local weight (average weight per sample) version of the
+	  weight above.
  	- cgroup_id: ID derived from cgroup namespace device and inode numbers.
  	- cgroup: cgroup pathname in the cgroupfs.
  	- transaction: Transaction abort flags.
@@ -110,8 +111,9 @@ OPTIONS
  	--time-quantum (default 100ms). Specify with overhead and before it.
  	- code_page_size: the code page size of sampled code address (ip)
  	- ins_lat: Instruction latency in core cycles. This is the global instruction
-	  latency
-	- local_ins_lat: Local instruction latency version
+	  latency (total instruction latency sum)
+	- local_ins_lat: Local instruction latency (average instruction latency per
+	  sample) version

  	By default, comm, dso and symbol keys are used.
  	(i.e. --sort comm,dso,symbol)
-- 
2.7.4
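
To make the two definitions concrete, here is a minimal Python sketch.
It is not perf's implementation: the sample data and the summarize()
helper are hypothetical, and only the field names ins_lat and
local_ins_lat come from the patch above. It assumes each sample carries
a symbol and an instruction latency in core cycles, and groups samples
by symbol.

```python
from collections import defaultdict


def summarize(samples):
    """Group (symbol, latency_cycles) samples by symbol.

    Hypothetical helper for illustration only; perf does this in C.
    For each symbol it returns:
      ins_lat       - 'global' value: sum of latencies over all samples
      local_ins_lat - 'local' value: that sum / number of samples
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for symbol, cycles in samples:
        totals[symbol] += cycles
        counts[symbol] += 1
    return {
        symbol: {
            "ins_lat": totals[symbol],
            "local_ins_lat": totals[symbol] / counts[symbol],
        }
        for symbol in totals
    }


# Made-up samples: (symbol, instruction latency in core cycles).
samples = [("memcpy", 120), ("memcpy", 80), ("strlen", 30)]
summary = summarize(samples)

for symbol, fields in summary.items():
    print(symbol, fields["ins_lat"], fields["local_ins_lat"])
```

With a single sample per entry the two values are equal; they only
diverge once an entry aggregates several samples.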


Thanks,
Kan