Open Source and information security mailing list archives
Message-ID: <5527C919.1080601@openwall.com>
Date: Fri, 10 Apr 2015 15:59:05 +0300
From: Alexander Cherepanov <ch3root@...nwall.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] On type aliasing and similar issues

On 2015-04-08 17:05, Solar Designer wrote:
> On Wed, Apr 08, 2015 at 07:25:29AM +0300, Alexander Cherepanov wrote:
>> On 2015-04-08 06:03, Samuel Neves wrote:
>>> On 04/08/2015 03:37 AM, Alexander Cherepanov wrote:
>>>> AFACT this is implementation-defined in C89 (3.3.2.3) and fully defined
>>>> in C99 and C11 (6.5.2.3p3).
>>>
>>> Yes, type punning with unions is now OK (though implementation-defined;
>>> accessing the wrong member may still trap) in
>>
>> uint32_t[2] and uint64_t don't have padding bits and have the same size
>> so this particular example is fully defined.
>
> FWIW, icc 14.0.0 miscompiles the code I have in php_mt_seed 3.2 if I
> remove the "volatile" workaround:
>
> #ifdef __ICC
> 			volatile
> #endif
> 			union {
> 				vtype v;
> 				uint32_t s[sizeof(vtype) / 4];
> 			} u[8], uM[8];
>
> where vtype is e.g.:
>
> typedef __m128i vtype;
>
> and the union members are only accessed directly, not via pointers,
> although indeed there are uses like u[i].s[j] (so with non-constant
> index for the .s[] array).

That's a moot point indeed and I didn't fully appreciate the fact that 
one of the members in the unions in the previous examples is an array. I 
was thinking about cases more like this:

union {
   struct { uint32_t x0, x1; } x;
   uint64_t y;
} v;

The direct use of member names is relatively clear -- it's allowed and
it's plainly spelled out in a footnote in 6.5.2.3p3 (C99 and C11). The
use through pointers is also relatively clear -- it's prohibited, which
is plainly spelled out in the gcc doc[1]. It's not entirely clear whether
this follows from the C standard, but the gcc approach seems to be
permitted by the standard. And, e.g., "Aliasing restrictions of C11
formalized in Coq"[2] follows the gcc doc:

   "In order to enable type-based alias analysis, we have to ensure that 
only under certain conditions a union can be read using a pointer to 
another variant than the current one (this is called type-punning [6, 
6.5.2.3]). Since the C11 standard is unclear about these conditions,
we follow the GCC documentation [4] on it. It states that “type-punning 
is allowed, provided the memory is accessed through the union type”."

[1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Type-punning
[2] http://robbertkrebbers.nl/research/articles/aliasing.pdf
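
For illustration, the direct-member case described above can be sketched
as follows. This is only a minimal sketch: the helper name pun_direct is
hypothetical and not from the thread, and it assumes the inner struct has
no padding, so that both members have the same size.

```c
#include <stdint.h>

/* Hypothetical helper (not from the thread): stores two 32-bit values
   through one union member and reads the 64-bit member directly by
   name, i.e. the access always goes through the union object itself. */
uint64_t pun_direct(uint32_t x0, uint32_t x1)
{
    union {
        struct { uint32_t x0, x1; } x;
        uint64_t y;
    } v;

    v.x.x0 = x0;
    v.x.x1 = x1;

    /* Direct member access, no pointer to a member is formed.
       Which half ends up in the high-order bits depends on byte order. */
    return v.y;
}
```

Passing the same value for both halves gives a result that does not
depend on byte order, which is convenient for a quick check.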

Everything becomes more complicated when a member of a union is an 
array. It's somewhat in-between these two cases and I'm not sure how 
it's supposed to be treated.

> I thought this was an icc bug, but maybe I'm wrong?

First of all, which version of C does icc target? In C89, accessing a
union through a wrong member is implementation-defined behavior. Does
icc document it? gcc documents it here:

https://gcc.gnu.org/onlinedocs/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html

Side note: not much has changed between C89 and C99 on this question.
Accessing a wrong member in a union is implementation-defined behavior
in C89, but a footnote in 3.3.2.3 implies that the reason for this is
indeterminate byte order. OTOH this behavior is defined in C99, but the
byte order is still not specified. Hence a strictly conforming program
shouldn't use it anyway.
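
To make the byte order point concrete, here is a small sketch. The
function name low_half_first is hypothetical, and the inner struct is
assumed to have no padding. The access itself is a direct member read,
but the value it yields differs between little-endian and big-endian
targets, which is why a strictly conforming program cannot rely on it.

```c
#include <stdint.h>

/* Hypothetical helper (not from the thread): returns 1 if the first
   32-bit member overlays the low-order half of the 64-bit member
   (little-endian layout) and 0 otherwise. */
int low_half_first(void)
{
    union {
        struct { uint32_t x0, x1; } x;
        uint64_t y;
    } v;

    v.y = 1;              /* only the lowest-order bit is set */
    return v.x.x0 == 1;   /* direct member read; result is target-dependent */
}
```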

> Full context: http://www.openwall.com/php_mt_seed/
>
>>> C99 and above. What is being dereferenced in the example is the pointer to
>>> the union, not the members, so I'm not sure
>>> strict aliasing's undefined behavior applies. The example could be further
>>> improved to demonstrate this:
>>>
>>>    #include <stdint.h>
>>>    #include <stdio.h>
>>>
>>>    union U {
>>>      uint32_t x[2];
>>>      uint64_t y;
>>>    };
>>>
>>>    extern union U * v;
>>>
>>>    void f() {
>>>      uint32_t * p = &v->x[0];
>>>      uint64_t * q = &v->y;
>>>      *p = 17;
>>>      *q = 42;
>>>      printf("%u\n", v->x[0]);
>>>    }
>>>
>>> While Clang and recent GCC do what one would hope (print 42), GCC 3.4 and
>>> Intel compiler print 17.
>>
>> Yes, it seems the standard intends to permit access to wrong members
>> only via . and -> operators. You cannot access them in a less direct
>> way.
>
> In my php_mt_seed example, is accessing .s[j] standards compliant or
> not?  If not, that's really unfortunate.

I don't know, sorry.

> What about e.g., .s[3] (constant index)?  We could want to narrow down
> icc's behavior - whether the problem occurs only with variable or also
> with constant indices, or maybe even without an array at all.  I haven't
> tried yet.

If you can eliminate the use of non-constant indexes then you can 
replace everything with non-array variables as well. Perhaps this is the 
easiest way in your case.

Another possible way is to pipe values through an intermediate struct:

struct s {
   uint32_t s[sizeof(vtype) / 4];
} u[8], uM[8];

union u {
   vtype v;
   struct s s;
} tmp;

tmp.v = a; u[0] = tmp.s;
...

Or without an explicit variable:

u[0] = ((union u){a}).s;
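
A more complete sketch of the intermediate-struct idea follows. The
vtype below is a portable 16-byte stand-in chosen only so the example
compiles anywhere (in php_mt_seed it is e.g. __m128i), and the helper
names to_words and word_at are hypothetical. The point is that the
variable index is applied to an ordinary struct copy rather than to a
union member. Whether this actually avoids the icc problem is untested.

```c
#include <stdint.h>

/* Assumption: a portable 16-byte stand-in for the SIMD type used in
   the thread (there vtype is e.g. __m128i). */
typedef struct { uint64_t lo, hi; } vtype;

struct s {
    uint32_t s[sizeof(vtype) / 4];
};

union u {
    vtype v;
    struct s s;
};

/* Hypothetical helper: copy the vector out as a plain struct of words.
   The whole struct member is read directly by name from the union. */
struct s to_words(vtype a)
{
    union u tmp;
    tmp.v = a;
    return tmp.s;
}

/* The non-constant index j is applied to a struct copy, so no union
   member is indexed at this point. */
uint32_t word_at(vtype a, unsigned j)
{
    struct s w = to_words(a);
    return w.s[j];
}
```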

>> If you pass them to another function then recent gcc -O2 will print
>> 17 too.
>
> This is not surprising.  However, I think behavior within one function,
> where having derived a pointer from a union member is clearly visible to
> the compiler, could be defined.

The gcc doc ([1] above) specifically warns against it:

   "
           union a_union {
             int i;
             double d;
           };

   [skip]
   However, this code might not [work as expected]:

           int f() {
             union a_union t;
             int* ip;
             t.d = 3.0;
             ip = &t.i;
             return *ip;
           }
   "

> Does any recent C standard say anything about the special case of
> accessing union members via pointers within the same function?

I don't think so.

-- 
Alexander Cherepanov
