Message-ID: <19e192a7-f8e1-2f04-48fb-8ea668ba32ca@arm.com>
Date: Thu, 27 Jun 2019 16:34:51 +0100
From: Vincenzo Frascino <vincenzo.frascino@....com>
To: Dave Martin <Dave.Martin@....com>
Cc: linux-arch@...r.kernel.org, Shijith Thotton <sthotton@...vell.com>,
Peter Collingbourne <pcc@...gle.com>,
Arnd Bergmann <arnd@...db.de>,
Huw Davies <huw@...eweavers.com>,
Andre Przywara <andre.przywara@....com>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Will Deacon <will.deacon@....com>, linux-mips@...r.kernel.org,
Ralf Baechle <ralf@...ux-mips.org>,
linux-kernel@...r.kernel.org, Paul Burton <paul.burton@...s.com>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
linux-kselftest@...r.kernel.org,
Catalin Marinas <catalin.marinas@....com>,
Russell King <linux@...linux.org.uk>,
Dmitry Safonov <0x7f454c46@...il.com>,
Mark Salyzyn <salyzyn@...roid.com>,
Shuah Khan <shuah@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v7 04/25] arm64: Substitute gettimeofday with C
implementation
Hi Dave,
On 6/27/19 3:38 PM, Dave Martin wrote:
> On Thu, Jun 27, 2019 at 12:59:07PM +0100, Vincenzo Frascino wrote:
>> On 6/27/19 12:27 PM, Dave Martin wrote:
>>> On Thu, Jun 27, 2019 at 11:57:36AM +0100, Vincenzo Frascino wrote:
>
> [...]
>
>>>> Disassembly of section .text:
>>>> 0000000000000000 show_it:
>>>> 0: e8 03 1f aa mov x8, xzr
>>>> 4: 09 68 68 38 ldrb w9, [x0, x8]
>>>> 8: 08 05 00 91 add x8, x8, #1
>>>> c: c9 ff ff 34 cbz w9, #-8 <show_it+0x4>
>>>> 10: 02 05 00 51 sub w2, w8, #1
>>>> 14: e1 03 00 aa mov x1, x0
>>>> 18: 08 08 80 d2 mov x8, #64
>>>> 1c: 01 00 00 d4 svc #0
>>>> 20: c0 03 5f d6 ret
>>>>
>>>> Commands used:
>>>>
>>>> $ clang -target aarch64-linux-gnueabi main.c -O -c -o main.clang.<x>.o
>>>> $ llvm-objdump -d main.clang.<x>.o
>>>
>>> Actually, I'm not sure this is comparable with the reproducer I quoted
>>> in my last reply.
>>>
>>
>> As explained in my previous email, this is the only case that can realistically
>> happen. vDSO has no dependency on any other library (i.e. libgcc you were
>> mentioning) and we are referring to the fallbacks which fall in this category.
>
> Outlining could also introduce a local function call where none exists
> explicitly in the program IIUC.
>
> My point is that the interaction between asm reg vars and machine-level
> procedure calls is at best ill-defined, and it is largely up to the
> compiler when to introduce such a call, even without LTO etc.
>
> So we should not be surprised to see variations in behaviour depending
> on compiler, compiler version and compiler flags.
>
I tested 10 versions of the compiler and, apart from gcc-5.1, which triggers
the issue in one specific case and not in the vDSO library, I could not find
evidence of the problem.
>>> The compiler can see the definition of strlen and fully inlines it.
>>> I only ever saw the problem when the compiler emits an out-of-line
>>> implicit function call.
>>>
>>> What does clang do with my example on 32-bit?
>>
>> When clang is selected, compat vDSOs are currently disabled on arm64; they
>> will be introduced with a future patch series.
>>
>> Anyway since I am curious as well, this is what happens with your example with
>> clang.8 target=arm-linux-gnueabihf:
>>
>> dave-code.clang.8.o: file format ELF32-arm-little
>>
>> Disassembly of section .text:
>> 0000000000000000 foo:
>> 0: 00 00 00 ef svc #0
>> 4: 1e ff 2f e1 bx lr
>>
>> 0000000000000008 bar:
>> 8: 10 4c 2d e9 push {r4, r10, r11, lr}
>> c: 08 b0 8d e2 add r11, sp, #8
>> 10: 00 40 a0 e1 mov r4, r0
>> 14: fe ff ff eb bl #-8 <bar+0xc>
>> 18: 00 10 a0 e1 mov r1, r0
>> 1c: 04 00 a0 e1 mov r0, r4
>> 20: 00 00 00 ef svc #0
>> 24: 10 8c bd e8 pop {r4, r10, r11, pc}
>
>> Compiling with -O2, -O3 or -Os never inlines it.
>
> Looks sane, and is the behaviour we want.
>
>> Same thing happens for aarch64-linux-gnueabi:
>>
>> dave-code.clang.8.o: file format ELF64-aarch64-little
>>
>> Disassembly of section .text:
>> 0000000000000000 foo:
>> 0: e0 03 00 2a mov w0, w0
>> 4: e1 03 01 2a mov w1, w1
>> 8: 01 00 00 d4 svc #0
>> c: c0 03 5f d6 ret
>>
>> 0000000000000010 bar:
>> 10: 01 0c c1 1a sdiv w1, w0, w1
>> 14: e0 03 00 2a mov w0, w0
>> 18: 01 00 00 d4 svc #0
>> 1c: c0 03 5f d6 ret
>
> Curious, clang seems to be inserting some seemingly redundant moves
> of its own here, though this shouldn't break anything.
>
> I suspect that clang might require an X-reg holding an int to have its
> top 32 bits zeroed for passing to an asm, whereas GCC does not. I think
> this comes under "we should not be surprised to see variations".
>
> GCC 9 does this instead:
>
> 0000000000000000 <foo>:
> 0: d4000001 svc #0x0
> 4: d65f03c0 ret
>
> 0000000000000008 <bar>:
> 8: 1ac10c01 sdiv w1, w0, w1
> c: d4000001 svc #0x0
> 10: d65f03c0 ret
>
>
>> Based on this I think we can conclude our investigation.
>
> So we use non-reg vars and use the asm clobber list and explicit moves
> to get things into / out of the right registers?
>
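For reference, the approach described in the question above (plain variables,
explicit moves inside the asm, and a clobber list) might look roughly like the
sketch below. raw_write() is a hypothetical helper, not code from this patch
series; the choice of the write syscall and the non-arm64 fallback to write()
are assumptions made only so the example builds and runs elsewhere.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>
#if defined(__aarch64__)
#include <sys/syscall.h>	/* __NR_write */
#endif

/*
 * Hypothetical helper for illustration only; not part of the patch series.
 *
 * No "register ... asm("x0")" variables are used. The arguments are moved
 * into the syscall registers inside the asm itself, and those registers are
 * named in the clobber list, so the compiler keeps the operands out of them.
 *
 * On arm64 the raw kernel result is returned (negative errno on failure);
 * the fallback returns whatever write() returns.
 */
static long raw_write(int fd, const void *buf, size_t len)
{
#if defined(__aarch64__)
	long ret;

	__asm__ __volatile__(
		"mov x0, %1\n\t"	/* fd */
		"mov x1, %2\n\t"	/* buf */
		"mov x2, %3\n\t"	/* len */
		"mov x8, %4\n\t"	/* syscall number */
		"svc #0\n\t"
		"mov %0, x0"		/* result comes back in x0 */
		: "=r" (ret)
		: "r" ((long)fd), "r" (buf), "r" (len), "r" ((long)__NR_write)
		: "x0", "x1", "x2", "x8", "memory");
	return ret;
#else
	/* Assumed fallback so the sketch builds on other architectures. */
	return (long)write(fd, buf, len);
#endif
}
```

The cost of this form is a few extra moves per call; whether that matters for
the vDSO fallbacks would need to be measured.
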
I believe I have provided enough evidence, based on the behavior of various
versions of the compilers, that the library as it stands is consistent and does
not suffer from any of the issues you reported. I will therefore keep my code
as is, at least for this release, and revisit it in the future if something
comes up.
If you manage to prove, with some version of the compiler, that my library as
it stands (no code additions or source modifications) has the issues you
mentioned, that changes everything.
Happy to hear from you.
> Cheers
> ---Dave
>
--
Regards,
Vincenzo