[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a995445f-9b00-ca13-d23a-1aea3b345718@c-s.fr>
Date: Sat, 11 Jan 2020 08:05:34 +0000
From: Christophe Leroy <christophe.leroy@....fr>
To: Thomas Gleixner <tglx@...utronix.de>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>, arnd@...db.de,
vincenzo.frascino@....com, luto@...nel.org
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-arm-kernel@...ts.infradead.org, linux-mips@...r.kernel.org,
x86@...nel.org
Subject: Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in
__c_kernel_time()
On 01/10/2020 09:12 PM, Thomas Gleixner wrote:
> Christophe Leroy <christophe.leroy@....fr> writes:
>>
>> diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
>> index 17b4cff6e5f0..5a17a9d2e6cd 100644
>> --- a/lib/vdso/gettimeofday.c
>> +++ b/lib/vdso/gettimeofday.c
>> @@ -144,7 +144,7 @@ __cvdso_gettimeofday(const struct vdso_data *vd, struct __kernel_old_timeval *tv
>> static __maybe_unused __kernel_old_time_t
>> __cvdso_time(const struct vdso_data *vd, __kernel_old_time_t *time)
>> {
>> - __kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
>> + __kernel_old_time_t t = vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec;
>>
>> if (time)
>> *time = t;
>
> Allows the compiler to load twice, i.e. the returned value might be different from the
> stored value. So no.
>
With READ_ONCE() the 64 bits are being read:
00000ac8 <__c_kernel_time>:
ac8: 2c 03 00 00 cmpwi r3,0
acc: 81 44 00 20 lwz r10,32(r4)
ad0: 81 64 00 24 lwz r11,36(r4)
ad4: 41 82 00 08 beq adc <__c_kernel_time+0x14>
ad8: 91 63 00 00 stw r11,0(r3)
adc: 7d 63 5b 78 mr r3,r11
ae0: 4e 80 00 20 blr
Without the READ_ONCE() only 32 bits are read. That's the most optimal.
00000ac8 <__c_kernel_time>:
ac8: 7c 69 1b 79 mr. r9,r3
acc: 80 64 00 24 lwz r3,36(r4)
ad0: 4d 82 00 20 beqlr
ad4: 90 69 00 00 stw r3,0(r9)
ad8: 4e 80 00 20 blr
Without READ_ONCE() but with a barrier() after the read, we should get
the same result but GCC (GCC 8.1) does less good:
00000ac8 <__c_kernel_time>:
ac8: 81 24 00 24 lwz r9,36(r4)
acc: 2f 83 00 00 cmpwi cr7,r3,0
ad0: 41 9e 00 08 beq cr7,ad8 <__c_kernel_time+0x10>
ad4: 91 23 00 00 stw r9,0(r3)
ad8: 7d 23 4b 78 mr r3,r9
adc: 4e 80 00 20 blr
Assuming both part of the 64 bits data will fall into a single
cacheline, the second read is in the noise.
So agreed to drop this change.
Christophe
Powered by blists - more mailing lists