Message-ID: <CAO6TR8WW6Nkc-StZLHD5_FcF4SjajfQMsrchQVH6-+WYzc2-7A@mail.gmail.com>
Date: Mon, 18 Jan 2016 14:45:14 -0700
From: Jeff Merkey <linux.mdb@...il.com>
To: LKML <linux-kernel@...r.kernel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Andy Lutomirski <luto@...capital.net>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, X86 ML <x86@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Andy Lutomirski <luto@...nel.org>,
Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
Steven Rostedt <rostedt@...dmis.org>,
Borislav Petkov <bp@...en8.de>, Jiri Olsa <jolsa@...nel.org>
Subject: Re: x86_64 Compiler Output Kernel Bloat v4.4
On 1/18/16, Jeff Merkey <linux.mdb@...il.com> wrote:
> Hi,
>
> I noticed that in the assembler output for x86_64 builds, almost
> every function originating from C code begins with a nop
> instruction. My concern is wasted space, since each of these
> placeholders takes up 5 bytes at the head of the function. Does
> anyone know why this header is there in the first place?
> Since just about every function is prefaced by this inert 5-byte
> instruction, it adds up to quite a bit of bloat in the size of the
> Linux executable.
>
> 0xffffffffa073e010 0F1F440000 nop DWORD PTR [rax+rax]=0x0
>
> The Intel assembler format shows the bytes that make up each
> instruction. The GDB format does not. Both are provided.
>
> 0xffffffffa073e050 4155 push r13
> (0)> id mdb_watchdogs
> mdb|mdb_watchdogs:
> 0xffffffffa073e010 mdb_watchdogs: nopl 0x0(%rax,%rax,1)) <<
> 0xffffffffa073e015 mdb_watchdogs+0x5: push %rbp
> 0xffffffffa073e016 mdb_watchdogs+0x6: mov %rsp,%rbp
> 0xffffffffa073e019 mdb_watchdogs+0x9: callq 0xffffffff811337e0
> touch_softlockup_watchdog_sync
> 0xffffffffa073e01e mdb_watchdogs+0xe: callq 0xffffffff810f0ba0
> clocksource_touch_watchdog
> 0xffffffffa073e023 mdb_watchdogs+0x13: callq 0xffffffff810dea20
> rcu_cpu_stall_reset
> 0xffffffffa073e028 mdb_watchdogs+0x18: callq 0xffffffff811337c0
> touch_nmi_watchdog
> 0xffffffffa073e02d mdb_watchdogs+0x1d: pop %rbp
> 0xffffffffa073e02e mdb_watchdogs+0x1e: data16
> 0xffffffffa073e030 mdb_watchdogs+0x20: retq
> 0xffffffffa073e031 mdb_watchdogs+0x21: nopw %cs:0x0(%rax,%rax,1))
> mdb|mdb:
> 0xffffffffa073e040 mdb: nopl 0x0(%rax,%rax,1)) <<
> 0xffffffffa073e045 mdb+0x5: push %rbp
> 0xffffffffa073e046 mdb+0x6: mov %rsp,%rbp
> 0xffffffffa073e049 mdb+0x9: push %r15
> 0xffffffffa073e04b mdb+0xb: push %r14
> 0xffffffffa073e04d mdb+0xd: mov %rdi,%r14
> 0xffffffffa073e050 mdb+0x10: push %r13
> (0)> u mdb_watchdogs
> mdb|mdb_watchdogs:
> 0xffffffffa073e010 0F1F440000 nop DWORD PTR [rax+rax]=0x0 <<
> 0xffffffffa073e015 55 push rbp
> 0xffffffffa073e016 4889E5 mov rbp,rsp
> 0xffffffffa073e019 E8C2579FE0 call touch_softlockup_watchdog_sync
> 0xffffffffa073e01e E87D2B9BE0 call clocksource_touch_watchdog
> 0xffffffffa073e023 E8F8099AE0 call rcu_cpu_stall_reset
> 0xffffffffa073e028 E893579FE0 call touch_nmi_watchdog
> 0xffffffffa073e02d 5D pop rbp
> 0xffffffffa073e02e 6690 data16
> 0xffffffffa073e030 C3 ret
> 0xffffffffa073e031 6666666666662E0F1F840000000000 nop cs:WORD PTR
> [rax+rax]=0x0000
> mdb|mdb:
> 0xffffffffa073e040 0F1F440000 nop DWORD PTR [rax+rax]=0x0 <<
> 0xffffffffa073e045 55 push rbp
> 0xffffffffa073e046 4889E5 mov rbp,rsp
> 0xffffffffa073e049 4157 push r15
> 0xffffffffa073e04b 4156 push r14
> 0xffffffffa073e04d 4989FE mov r14,rdi
> 0xffffffffa073e050 4155 push r13
> (0)> g
>
> Jeff
>
I think xor eax,eax would be a lot shorter: 2 bytes instead of 5.
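
A rough sketch of that size comparison, for illustration only. The nop
bytes are copied from the listing above. The 31 C0 encoding for
xor eax,eax is the standard one but does not appear in the dump, and
the function count is a made-up number, not a measurement of any kernel
build. Note that xor eax,eax is not a true no-op, since it writes eax
and the flags.

```python
# Per-function header size comparison (illustrative sketch only).
# NOP5 is copied from the disassembly in this thread; XOR_EAX is the
# standard encoding of xor eax,eax and is an assumption not taken from the dump.
NOP5 = bytes.fromhex("0F1F440000")   # nop DWORD PTR [rax+rax]
XOR_EAX = bytes.fromhex("31C0")      # xor eax,eax (writes eax and flags, unlike a nop)


def header_overhead(num_functions: int, header: bytes) -> int:
    """Total bytes spent on a per-function header with the given encoding."""
    return num_functions * len(header)


if __name__ == "__main__":
    n = 10_000  # hypothetical function count, for illustration only
    print("bytes per header:", len(NOP5), "vs", len(XOR_EAX))
    print("total for", n, "functions:",
          header_overhead(n, NOP5), "vs", header_overhead(n, XOR_EAX))
```

The helper only multiplies a count by an encoding length, so any real
estimate would need the actual number of instrumented functions.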
Jeff