Message-ID: <57902B8A.8040907@huawei.com>
Date: Thu, 21 Jul 2016 09:55:22 +0800
From: zhouchengming <zhouchengming1@...wei.com>
To: Dave Hansen <dave.hansen@...el.com>
CC: <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<akpm@...ux-foundation.org>, <tj@...nel.org>,
<guohanjun@...wei.com>, <huawei.libin@...wei.com>
Subject: Re: [PATCH] make __section_nr more efficient

On 2016/7/21 5:36, Dave Hansen wrote:
> On 07/19/2016 09:18 PM, Zhou Chengming wrote:
>> When CONFIG_SPARSEMEM_EXTREME is disabled, __section_nr can get
>> the section number with a subtraction directly.
>
> Does this actually *do* anything?
>
> It was a long time ago, but if I remember correctly, the entire loop in
> __section_nr() goes away because root_nr==NR_SECTION_ROOTS, so
> root_nr=1, and the compiler optimizes away the entire subtraction.
>
> So this basically adds an #ifdef and gets us nothing, although it makes
> the situation much more explicit. Perhaps the comment should say that
> this works *and* is efficient because the compiler can optimize all the
> extreme complexity away.
>
> .
>
Thanks for your reply. I didn't know the compiler would optimize the loop away.
But when I look at the assembly code of __section_nr, the loop still seems to
be there.
My gcc version: gcc version 4.9.0 (GCC)
CONFIG_SPARSEMEM_EXTREME: disabled
Before this patch:
0000000000000000 <__section_nr>:
0: 55 push %rbp
1: 48 c7 c2 00 00 00 00 mov $0x0,%rdx
4: R_X86_64_32S mem_section
8: 31 c0 xor %eax,%eax
a: 48 89 e5 mov %rsp,%rbp
d: eb 0d jmp 1c <__section_nr+0x1c>
f: 48 83 c0 01 add $0x1,%rax
13: 48 81 fa 00 00 00 00 cmp $0x0,%rdx
16: R_X86_64_32S mem_section+0x800000
1a: 74 26 je 42 <__section_nr+0x42>
1c: 48 89 d1 mov %rdx,%rcx
1f: ba 10 00 00 00 mov $0x10,%edx
24: 48 85 c9 test %rcx,%rcx
27: 74 e6 je f <__section_nr+0xf>
29: 48 39 cf cmp %rcx,%rdi
2c: 48 8d 51 10 lea 0x10(%rcx),%rdx
30: 72 dd jb f <__section_nr+0xf>
32: 48 39 d7 cmp %rdx,%rdi
35: 73 d8 jae f <__section_nr+0xf>
37: 48 29 cf sub %rcx,%rdi
3a: 48 c1 ff 04 sar $0x4,%rdi
3e: 01 f8 add %edi,%eax
40: 5d pop %rbp
41: c3 retq
42: 48 29 cf sub %rcx,%rdi
45: b8 00 00 08 00 mov $0x80000,%eax
4a: 48 c1 ff 04 sar $0x4,%rdi
4e: 01 f8 add %edi,%eax
50: 5d pop %rbp
51: c3 retq
52: 66 66 66 66 66 2e 0f data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
59: 1f 84 00 00 00 00 00
After this patch:
0000000000000000 <__section_nr>:
0: 55 push %rbp
1: 48 89 f8 mov %rdi,%rax
4: 48 2d 00 00 00 00 sub $0x0,%rax
6: R_X86_64_32S mem_section
a: 48 89 e5 mov %rsp,%rbp
d: 48 c1 f8 04 sar $0x4,%rax
11: 5d pop %rbp
12: c3 retq
13: 66 66 66 66 2e 0f 1f data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
1a: 84 00 00 00 00 00
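For illustration, here is a rough user-space model of the two versions. It is
only a sketch, not the kernel source: the struct layout, the array size and the
function names section_nr_loop/section_nr_direct are assumptions made for this
example. In the "before" listing the loop steps by 0x10 bytes per iteration and
stops at mem_section+0x800000, which suggests one 16-byte section per root and
0x80000 roots, so the loop is a linear search over the roots rather than a
single iteration the compiler can drop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel struct; 16 bytes on x86_64,
 * matching the 0x10 stride seen in the listing. */
struct mem_section {
	unsigned long section_mem_map;
	unsigned long *pageblock_flags;
};

/* Assumed values for the non-EXTREME case. The listing suggests 0x80000
 * roots; a smaller count is used here to keep the example light. */
#define SECTIONS_PER_ROOT 1
#define NR_SECTION_ROOTS  1024

static struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];

/* Model of the loop version: walk every root until one contains ms. */
static int section_nr_loop(struct mem_section *ms)
{
	unsigned long root_nr;
	struct mem_section *root = NULL;

	for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
		root = mem_section[root_nr];
		if (ms >= root && ms < root + SECTIONS_PER_ROOT)
			break;
	}

	/* Stand-in bounds check for this sketch: ms must be in the table. */
	assert(root_nr != NR_SECTION_ROOTS);

	return (int)(root_nr * SECTIONS_PER_ROOT + (ms - root));
}

/* Model of the patched version: the table is one flat static array,
 * so the section number is just the pointer difference. */
static int section_nr_direct(struct mem_section *ms)
{
	return (int)(ms - &mem_section[0][0]);
}
```

Under these assumptions both functions return the same value for any pointer
inside the table, but the first one needs up to NR_SECTION_ROOTS iterations
while the second is the single sub/sar pair seen in the "after" listing.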
Thanks!