lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sat, 27 Mar 2021 10:15:56 +0800 From: Yang Jihong <yangjihong1@...wei.com> To: Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org> CC: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, Mark Rutland <mark.rutland@....com>, Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...hat.com>, Yao Jin <yao.jin@...ux.intel.com>, <gustavoars@...nel.org>, <mliska@...e.cz>, linux-kernel <linux-kernel@...r.kernel.org>, <zhangjinhao2@...wei.com> Subject: Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode Hello, On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote: > Em Fri, Mar 26, 2021 at 12:25:37PM +0900, Namhyung Kim escreveu: >> On Fri, Mar 26, 2021 at 11:24 AM Yang Jihong <yangjihong1@...wei.com> wrote: >>> On 2021/3/19 20:35, Yang Jihong wrote: >>>> In hist__find_annotations function, since different hist_entry may point to same >>>> symbol, we free notes->src to signal already processed this symbol in stdio mode; >>>> when annotate, entry will skipped if notes->src is NULL to avoid repeated output. >>>> >>>> However, there is a problem, for example, run the following command: >>>> >>>> # perf record -e branch-misses -e branch-instructions -a sleep 1 >>>> >>>> perf.data file contains different types of sample event. >>>> >>>> If the same IP sample event exists in branch-misses and branch-instructions, >>>> this event uses the same symbol. When annotate branch-misses events, notes->src >>>> corresponding to this event is set to null, as a result, when annotate >>>> branch-instructions events, this event is skipped and no annotate is output. >>>> >>>> Solution of this patch is to remove zfree in hists__find_annotations and >>>> change sort order to "dso,symbol" to avoid duplicate output when different >>>> processes correspond to the same symbol. > >>>> Signed-off-by: Yang Jihong <yangjihong1@...wei.com> > >> Acked-by: Namhyung Kim <namhyung@...nel.org> > > Without looking at the patch, just at its description of the problem, I > tried to annotate two events in a group, to get the annotate group view > output with both events, and it seems I'm getting samples accounted for > both events: > > [root@...e ~]# perf record -e '{branch-misses,branch-instructions}' -a sleep 1 > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 2.296 MB perf.data (2507 samples) ] > [root@...e ~]# > [root@...e ~]# perf report | grep -v '^#' | head -5 > 17.49% 19.19% ThreadPoolForeg chromium-browser [.] v8::internal::ConcurrentMarking::Run > 12.17% 17.04% ThreadPoolForeg chromium-browser [.] v8::internal::Sweeper::RawSweep > 11.14% 11.63% ThreadPoolForeg chromium-browser [.] v8::internal::MarkingVisitorBase<v8::internal::ConcurrentMarkingVisitor, v8::internal::ConcurrentMarkingState>::ProcessStrongHeapObject<v8::internal::CompressedHeapObjectSlot> > 7.65% 7.84% ThreadPoolForeg chromium-browser [.] v8::internal::ConcurrentMarkingVisitor::ShouldVisit > 5.66% 6.72% ThreadPoolForeg chromium-browser [.] v8::internal::ConcurrentMarkingVisitor::VisitPointersInSnapshot > > [root@...e ~]# perf annotate --stdio2 v8::internal::ConcurrentMarking::Run > Samples: 444 of events 'anon group { branch-misses, branch-instructions }', 4000 Hz, Event count (approx.): 596221, [percent: local period] > v8::internal::ConcurrentMarking::Run() /usr/lib64/chromium-browser/chromium-browser > Percent > > > Disassembly of section .text: > > 0000000003290b30 <v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool)>: > v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool): > push %rbp > mov %rsp,%rbp > push %r15 > push %r14 > mov %rdi,%r14 > push %r13 > mov %edx,%r13d > push %r12 > mov %ecx,%r12d > push %rbx > sub $0x1298,%rsp > mov %rsi,-0x1228(%rbp) > mov %fs:0x28,%rax > mov %rax,-0x38(%rbp) > <SNIP> > movzwl 0x2(%rbx),%eax > test %ax,%ax > ↓ jne 4a9 > mov -0x10e8(%rbp),%rdx > cmpw $0x0,0x2(%rdx) > 0.41 0.39 ↓ je 4b90 > movq %rbx,%xmm0 > movq %rdx,%xmm2 > mov %rdx,%rbx > punpcklqdq %xmm2,%xmm0 > movups %xmm0,-0x10e8(%rbp) > movzwl 0x2(%rdx),%eax > 4a9: sub $0x1,%eax > mov %ax,0x2(%rbx) > 0.36 0.91 movzwl %ax,%eax > 0.60 0.00 mov 0x10(%rbx,%rax,8),%rax > 3.44 2.46 mov %rax,-0x11e0(%rbp) > 0.00 0.34 4bf: mov 0x8(%r13),%rax > 0.00 0.36 add $0x1,%r15d > 0.00 0.34 mov 0x110(%rax),%rax > mov 0x128(%rax),%rcx > 0.88 0.36 mov 0x8(%r13),%rax > mov 0x110(%rax),%rdx > mov 0x130(%rdx),%rdx > 0.00 0.48 mov 0x140(%rax),%rax > mov 0x110(%rax),%rsi > 0.61 0.47 mov -0x11e0(%rbp),%rax > 2.01 2.32 sub $0x1,%rax > cmp %rcx,%rax > 0.00 0.35 setae %cl > 1.31 0.33 cmp %rdx,%rax > setb %dl > 0.00 0.24 test %dl,%cl > 0.00 0.12 ↓ jne 4b70 > cmp %rsi,%rax > ↓ je 4b70 > mov (%rax),%eax > 29.10 29.90 add -0x1220(%rbp),%rax > cmpb $0x0,-0x1218(%rbp) > mov %rax,-0x1210(%rbp) > 0.00 0.65 ↓ jne 4fa0 > mov -0x11e0(%rbp),%r9 > lea 0x6(%rax),%rbx > 0.38 0.00 545: movzbl (%rbx),%eax > 4.90 5.34 cmp $0x4c,%al > ↓ ja 5026 > 0.58 0.00 550: lea v8::internal::FLAGDEFAULT_abort_on_contradictory_flags+0x457,%rdi > movslq (%rdi,%rax,4),%rax > 4.97 3.19 add %rdi,%rax > <SNIP> > > If I ask for number of samples: > > [root@...e ~]# perf config annotate.show_nr_samples=true > [root@...e ~]# perf annotate --stdio2 v8::internal::ConcurrentMarking::Run > > Samples: 444 of events 'anon group { branch-misses, branch-instructions }', 4000 Hz, Event count (approx.): 596221, [percent: local period] > v8::internal::ConcurrentMarking::Run() /usr/lib64/chromium-browser/chromium-browser > Samples > > > Disassembly of section .text: > > 0000000003290b30 <v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool)>: > v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool): > push %rbp > mov %rsp,%rbp > push %r15 > push %r14 > mov %rdi,%r14 > push %r13 > mov %edx,%r13d > push %r12 > mov %ecx,%r12d > push %rbx > sub $0x1298,%rsp > mov %rsi,-0x1228(%rbp) > mov %fs:0x28,%rax > mov %rax,-0x38(%rbp) > <SNIP> > movzwl 0x2(%rbx),%eax > test %ax,%ax > ↓ jne 4a9 > mov -0x10e8(%rbp),%rdx > cmpw $0x0,0x2(%rdx) > 1 1 ↓ je 4b90 > movq %rbx,%xmm0 > movq %rdx,%xmm2 > mov %rdx,%rbx > punpcklqdq %xmm2,%xmm0 > movups %xmm0,-0x10e8(%rbp) > movzwl 0x2(%rdx),%eax > 4a9: sub $0x1,%eax > mov %ax,0x2(%rbx) > 1 2 movzwl %ax,%eax > 1 0 mov 0x10(%rbx,%rax,8),%rax > 8 5 mov %rax,-0x11e0(%rbp) > 0 1 4bf: mov 0x8(%r13),%rax > 0 1 add $0x1,%r15d > 0 1 mov 0x110(%rax),%rax > mov 0x128(%rax),%rcx > 3 1 mov 0x8(%r13),%rax > mov 0x110(%rax),%rdx > mov 0x130(%rdx),%rdx > 0 1 mov 0x140(%rax),%rax > mov 0x110(%rax),%rsi > 2 1 mov -0x11e0(%rbp),%rax > 6 6 sub $0x1,%rax > cmp %rcx,%rax > 0 1 setae %cl > 2 1 cmp %rdx,%rax > setb %dl > 0 1 test %dl,%cl > 0 1 ↓ jne 4b70 > cmp %rsi,%rax > ↓ je 4b70 > mov (%rax),%eax > 58 73 add -0x1220(%rbp),%rax > cmpb $0x0,-0x1218(%rbp) > mov %rax,-0x1210(%rbp) > 0 1 ↓ jne 4fa0 > mov -0x11e0(%rbp),%r9 > lea 0x6(%rax),%rbx > 1 0 545: movzbl (%rbx),%eax > 10 13 cmp $0x4c,%al > ↓ ja 5026 > 1 0 550: lea v8::internal::FLAGDEFAULT_abort_on_contradictory_flags+0x457,%rdi > movslq (%rdi,%rax,4),%rax > 9 8 add %rdi,%rax > > <SNIP> > > So it seems to be working, what am I missing? Is this strictly non > group related? > Yes, it is non group related. This problem occurs only when different events need to be recorded at the same time, i.e.: perf record -e branch-misses -e branch-instructions -a sleep 1 The output results of perf script and perf annotate do not match. Some events are not output in perf annotate. > - Arnaldo > . >
Powered by blists - more mailing lists