Message-ID: <554B290D.6000005@redhat.com>
Date: Thu, 07 May 2015 10:57:49 +0200
From: Denys Vlasenko <dvlasenk@...hat.com>
To: "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...nel.org>
CC: Steven Rostedt <rostedt@...dmis.org>,
Borislav Petkov <bp@...en8.de>,
Andy Lutomirski <luto@...capital.net>,
Frederic Weisbecker <fweisbec@...il.com>,
Alexei Starovoitov <ast@...mgrid.com>,
Will Drewry <wad@...omium.org>,
Kees Cook <keescook@...omium.org>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86: Deinline cpuid_eax and friends
On 05/06/2015 10:41 PM, H. Peter Anvin wrote:
> On 05/06/2015 12:09 PM, Denys Vlasenko wrote:
>>>
>>> How on Earth does it make 44 bytes? Is this due to paravirt_fail?
>>
>> No, just this construct
>>
>> unsigned int eax, ebx, ecx, edx;
>> cpuid(op, &eax, &ebx, &ecx, &edx);
>>
>> is not really that cheap to set up. You need to allocate
>> variables on stack and take address of each:
>>
>> ffffffff81063668 <cpuid_eax>:
>> ffffffff81063668:       55                      push   %rbp
>> ffffffff81063669:       48 89 e5                mov    %rsp,%rbp
>> ffffffff8106366c:       48 83 ec 10             sub    $0x10,%rsp
>> ffffffff81063670:       48 8d 4d fc             lea    -0x4(%rbp),%rcx
>> ffffffff81063674:       89 7d f0                mov    %edi,-0x10(%rbp)
>> ffffffff81063677:       48 8d 55 f8             lea    -0x8(%rbp),%rdx
>> ffffffff8106367b:       48 8d 75 f4             lea    -0xc(%rbp),%rsi
>> ffffffff8106367f:       48 8d 7d f0             lea    -0x10(%rbp),%rdi
>> ffffffff81063683:       c7 45 f8 00 00 00 00    movl   $0x0,-0x8(%rbp)
>> ffffffff8106368a:       e8 3c ff ff ff          callq  ffffffff810635cb <__cpuid>
>> ffffffff8106368f:       8b 45 f0                mov    -0x10(%rbp),%eax
>> ffffffff81063692:       c9                      leaveq
>> ffffffff81063693:       c3                      retq
>>
>
> That almost certainly is due to paravirt_fail, because otherwise cpuid
> would be inline, and gcc actually knows how to optimize around the cpuid
> instruction to the point of eliminating the temporaries.
Yes, with HYPERVISOR_GUEST off cpuid_eax() is smaller:
ffffffff81055a66 <cpuid_eax>:
ffffffff81055a66:       55                      push   %rbp
ffffffff81055a67:       89 f8                   mov    %edi,%eax
ffffffff81055a69:       31 c9                   xor    %ecx,%ecx
ffffffff81055a6b:       48 89 e5                mov    %rsp,%rbp
ffffffff81055a6e:       53                      push   %rbx
ffffffff81055a6f:       0f a2                   cpuid
ffffffff81055a71:       5b                      pop    %rbx
ffffffff81055a72:       5d                      pop    %rbp
ffffffff81055a73:       c3                      retq
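
For reference, the construct under discussion can be approximated in
userspace roughly as follows. This is only a sketch, assuming gcc or clang
on x86-64; it mirrors the shape of the kernel helpers (the names
native_cpuid(), cpuid() and cpuid_eax() are reused for readability) but is
not a copy of the kernel source, and it has no paravirt indirection.

```c
/*
 * Userspace approximation of the cpuid helpers discussed in this thread.
 * Assumption: x86-64 with GNU-style inline asm. Not the kernel source.
 */

/* Executes CPUID; eax and ecx are both inputs and outputs. */
static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
                                unsigned int *ecx, unsigned int *edx)
{
    __asm__ volatile("cpuid"
                     : "=a"(*eax), "=b"(*ebx), "=c"(*ecx), "=d"(*edx)
                     : "0"(*eax), "2"(*ecx)
                     : "memory");
}

/* Generic wrapper: selects leaf 'op' with subleaf 0. */
static inline void cpuid(unsigned int op,
                         unsigned int *eax, unsigned int *ebx,
                         unsigned int *ecx, unsigned int *edx)
{
    *eax = op;
    *ecx = 0;
    native_cpuid(eax, ebx, ecx, edx);
}

/*
 * Out-of-line (deinlined) variant: one copy of the body, callers just
 * emit a call. Because cpuid() is inline here, the compiler is free to
 * keep the four temporaries in registers instead of on the stack.
 */
unsigned int cpuid_eax(unsigned int op)
{
    unsigned int eax, ebx, ecx, edx;

    cpuid(op, &eax, &ebx, &ecx, &edx);
    return eax;
}
```

Calling cpuid_eax(0) returns the highest supported basic CPUID leaf. When
cpuid() is instead an opaque out-of-line call, as with __cpuid in the
quoted disassembly, the four temporaries have to live in memory so that
their addresses can be passed, which is where the extra lea/mov setup
comes from.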
However, even this smaller body is not small enough for deinlining to make
vmlinux grow; text still shrinks by 21 bytes:
    text     data      bss       dec     hex filename
81746530 13978160 20066304 115790994 6e6d492 vmlinux.before
81746509 13978160 20066304 115790973 6e6d47d vmlinux
To recap, with this patch:
- Code is smaller both with and without HYPERVISOR_GUEST.
- The slowdown per cpuid_REG() call is at worst 4%.