[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <53A015B3.2070809@linaro.org>
Date: Tue, 17 Jun 2014 11:17:23 +0100
From: Daniel Thompson <daniel.thompson@...aro.org>
To: Russell King - ARM Linux <linux@....linux.org.uk>
CC: Rob Clark <robdclark@...il.com>,
Nicolas Pitre <nicolas.pitre@...aro.org>,
Arnd Bergmann <arnd.bergmann@...aro.org>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
patches@...aro.org, linaro-kernel@...ts.linaro.org
Subject: Re: [PATCH v3] ARM: add get_user() support for 8 byte types
On 12/06/14 16:58, Russell King - ARM Linux wrote:
> On Thu, Jun 12, 2014 at 04:42:35PM +0100, Daniel Thompson wrote:
>> A new atomic modeset/pageflip ioctl being developed in DRM requires
>> get_user() to work for 64bit types (in addition to just put_user()).
>>
>> v1: original
>> v2: pass correct size to check_uaccess, and better handling of narrowing
>> double word read with __get_user_xb() (Russell King's suggestion)
>> v3: fix a couple of checkpatch issues
>
> This is still unsafe.
>
>> #define __get_user_check(x,p) \
>> ({ \
>> unsigned long __limit = current_thread_info()->addr_limit - 1; \
>> register const typeof(*(p)) __user *__p asm("r0") = (p);\
>> - register unsigned long __r2 asm("r2"); \
>> + register typeof(x) __r2 asm("r2"); \
>
> So, __r2 becomes the type of 'x'. If 'x' is a 64-bit type, and *p is
> an 8-bit, 16-bit, or 32-bit type, this fails horribly by leaving the
> upper word of __r2 undefined.
It is true that at after the switch statement the contents of r3 are
undefined. However...
> register unsigned long __l asm("r1") = __limit; \
> register int __e asm("r0"); \
> switch (sizeof(*(__p))) { \
> @@ -142,6 +152,12 @@ extern int __get_user_4(void *);
> case 4: \
> __get_user_x(__r2, __p, __e, __l, 4); \
> break; \
> + case 8: \
> + if (sizeof((x)) < 8) \
> + __get_user_xb(__r2, __p, __e, __l, 4);\
> + else \
> + __get_user_x(__r2, __p, __e, __l, 8);\
> + break; \
> default: __e = __get_user_bad(); break; \
> } \
> x = (typeof(*(p))) __r2; \
... at this point there is a narrowing cast followed by an implicit
widening. This results in compiler either ignoring r3 altogether or, if
spilling to the stack, generating code to set r3 to zero before doing
the store.
I think this approach also makes 8-bit and 16-bit get_user() faster in
some cases where the type of *p and x are similar 8- or 16-bit types.
This is because the compiler will never generate a redundant narrowings.
Note that the speed improvement looks extremely marginal; the size of
the .text section (for a multi_v7_defconfig kernel) only gets 96 bytes
smaller (a.k.a. 0.0015%).
Daniel.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists