Message-ID: <56F52E83.70409@huawei.com>
Date: Fri, 25 Mar 2016 20:26:43 +0800
From: "Wangnan (F)" <wangnan0@...wei.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: <mingo@...hat.com>, <linux-kernel@...r.kernel.org>,
He Kuang <hekuang@...wei.com>,
Alexei Starovoitov <ast@...nel.org>,
"Arnaldo Carvalho de Melo" <acme@...hat.com>,
Brendan Gregg <brendan.d.gregg@...il.com>,
"Jiri Olsa" <jolsa@...nel.org>,
Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
Namhyung Kim <namhyung@...nel.org>,
Zefan Li <lizefan@...wei.com>, <pi3orama@....com>
Subject: Re: [PATCH 3/5] perf core: Prepare writing into ring buffer from end
On 2016/3/23 17:50, Peter Zijlstra wrote:
> On Mon, Mar 14, 2016 at 09:59:43AM +0000, Wang Nan wrote:
>> Convert perf_output_begin to __perf_output_begin and make the latter
>> function able to write records from the end of the ring buffer.
>> Following commits will utilize the 'backward' flag.
>>
>> This patch doesn't introduce any extra performance overhead since we
>> use always_inline.
> So while I agree that with __always_inline and constant propagation we
> _should_ end up with the same code, we have:
>
> $ size defconfig-build/kernel/events/ring_buffer.o.{pre,post}
>    text    data     bss     dec     hex filename
>    3785       2       0    3787     ecb defconfig-build/kernel/events/ring_buffer.o.pre
>    3673       2       0    3675     e5b defconfig-build/kernel/events/ring_buffer.o.post
>
> The patch actually makes the file shrink.
>
> So I think we still want to have some actual performance numbers.
In my environment the two objects are nearly identical:
$ objdump -d kernel/events/ring_buffer.o.new > ./out.new.S
$ objdump -d kernel/events/ring_buffer.o.old > ./out.old.S
--- ./out.old.S 2016-03-25 12:18:52.060656423 +0000
+++ ./out.new.S 2016-03-25 12:18:45.376630269 +0000
@@ -1,5 +1,5 @@
-kernel/events/ring_buffer.o.old: file format elf64-x86-64
+kernel/events/ring_buffer.o.new: file format elf64-x86-64
Disassembly of section .text:
@@ -320,7 +320,7 @@
402: 4d 8d 04 0f lea (%r15,%rcx,1),%r8
406: 48 89 c8 mov %rcx,%rax
409: 4c 0f b1 43 40 cmpxchg %r8,0x40(%rbx)
- 40e: 48 39 c8 cmp %rcx,%rax
+ 40e: 48 39 c1 cmp %rax,%rcx
411: 75 b4 jne 3c7 <perf_output_begin+0xc7>
413: 48 8b 73 58 mov 0x58(%rbx),%rsi
417: 48 8b 43 68 mov 0x68(%rbx),%rax
@@ -357,7 +357,7 @@
480: 85 c0 test %eax,%eax
482: 0f 85 02 ff ff ff jne 38a <perf_output_begin+0x8a>
488: 48 c7 c2 00 00 00 00 mov $0x0,%rdx
- 48f: be 7c 00 00 00 mov $0x7c,%esi
+ 48f: be 89 00 00 00 mov $0x89,%esi
494: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
49b: c6 05 00 00 00 00 01 movb $0x1,0x0(%rip) # 4a2 <perf_output_begin+0x1a2>
4a2: e8 00 00 00 00 callq 4a7 <perf_output_begin+0x1a7>
@@ -874,7 +874,7 @@
c39: eb e7 jmp c22 <perf_aux_output_begin+0x172>
c3b: 80 3d 00 00 00 00 00 cmpb $0x0,0x0(%rip) # c42 <perf_aux_output_begin+0x192>
c42: 75 93 jne bd7 <perf_aux_output_begin+0x127>
- c44: be 2b 01 00 00 mov $0x12b,%esi
+ c44: be 49 01 00 00 mov $0x149,%esi
c49: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
c50: e8 00 00 00 00 callq c55 <perf_aux_output_begin+0x1a5>
c55: c6 05 00 00 00 00 01 movb $0x1,0x0(%rip) # c5c <perf_aux_output_begin+0x1ac>
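The only differences are the operand order of one cmp and two immediates
loaded into %esi (presumably the line numbers passed to the warning path,
which shift because the patch adds lines to the file). So I think you
enabled some unusual config options?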
Thank you.
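
For readers who want to see why always_inline plus a constant 'backward'
argument is expected to cost nothing, here is a minimal userspace sketch
of the pattern. It is not the kernel code: struct toy_rb, __toy_reserve()
and the two wrappers are hypothetical names made up for illustration, and
the head/offset arithmetic is simplified (no cmpxchg loop, no pages). It
assumes GCC or Clang for __attribute__((always_inline)).

```c
#include <stdbool.h>

/* Hypothetical stand-in for a ring buffer; not the kernel's struct. */
struct toy_rb {
	unsigned long head;	/* free-running write position */
	unsigned long size;	/* buffer size in bytes, power of two */
};

/*
 * Shared body. Because it is always inlined and every caller passes a
 * compile-time constant for 'backward', the compiler can drop the
 * untaken branch in each wrapper.
 */
static inline __attribute__((always_inline)) unsigned long
__toy_reserve(struct toy_rb *rb, unsigned long len, bool backward)
{
	unsigned long offset;

	if (!backward) {
		/* Forward: record starts at the old head. */
		offset = rb->head;
		rb->head += len;
	} else {
		/* Backward: head moves down, record starts at the new head. */
		rb->head -= len;
		offset = rb->head;
	}

	/* Unsigned wrap-around plus the mask keeps offset inside the buffer. */
	return offset & (rb->size - 1);
}

/* Existing entry point keeps its signature and passes a constant flag. */
unsigned long toy_reserve_forward(struct toy_rb *rb, unsigned long len)
{
	return __toy_reserve(rb, len, false);
}

/* New entry point for writing from the end of the buffer. */
unsigned long toy_reserve_backward(struct toy_rb *rb, unsigned long len)
{
	return __toy_reserve(rb, len, true);
}
```

With optimization enabled, each wrapper should compile down to only its
own branch of the helper, which is why the forward path is expected to
produce the same code before and after such a conversion. Comparing
objdump output for the two builds, as above, is one way to check that
on a given compiler and config.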