Message-ID: <552B1CDB.9040803@openwall.com>
Date: Mon, 13 Apr 2015 04:33:15 +0300
From: Alexander Cherepanov <ch3root@...nwall.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] On type aliasing and similar issues

On 2015-04-10 17:19, Solar Designer wrote:
> On Fri, Apr 10, 2015 at 03:59:05PM +0300, Alexander Cherepanov wrote:
>> The direct use of member names is relatively clear -- it's allowed and
>> it's plainly spelled out in a footnote in 6.5.2.3p3 (C99 and C11). The
>> use through pointers is also relatively clear -- it's prohibited, which
>> is plainly spelled out in gcc doc[1].
> [...]
>> Everything becomes more complicated when a member of a union is an
>> array. It's somewhat in-between these two cases and I'm not sure how
>> it's supposed to be treated.
>
> What about uses like this:

This question turned out to be surprisingly difficult. After a lot of
reading, I think I have some understanding of what's going on.

1. I've got an interpretation of the relevant parts of the C standard
(as written) which is quite simple and hopefully not self-contradictory.

2. This interpretation is not what the Committee intended. The problem
is that the Committee didn't write down what it wants and probably
doesn't yet know exactly what it wants. The sorry state of affairs is
perfectly demonstrated by Defect Report #236 [1], submitted 2000-10-18.
The first example in this DR is allowed by the standard (in the opinion
of the reporter, and in mine too), but the DR was closed saying that
"Both programs invoke undefined behavior" without much further
explanation. I would have been scratching my head over what that means
for a long time, but there are many discussions of this DR, and the
latest one[2] states (2010-10-08):

"In 2005-04 (Lillehammer), the committee gave up waiting for the words
to materialize, instead deciding simply to state the committee's
intention in the DR response, without worrying about whether that
intention was accurately described by the standard."

It seems there has not been much progress in this area during the last
15 years, including with the release of the C11 standard.

[1] http://open-std.org/jtc1/sc22/wg14/www/docs/dr_236.htm
[2] http://open-std.org/jtc1/sc22/wg14/www/docs/n1520.htm

3. GCC has its own rules, which are stricter than the C standard. That
is, some strictly conforming programs are miscompiled. There is a
thread[3] which discusses a question very similar to yours quoted below.
A good explanation is in [4]; it ends with this:

"the original poster is correct that GCC doesn't implement C99 aliasing
as written in the standard regarding unions. We don't do so because we
determined that this can't possibly have been the intent of the standard
as it makes type-based aliasing relatively useless."

[3] https://gcc.gnu.org/ml/gcc/2010-01/threads.html#00013
[4] https://gcc.gnu.org/ml/gcc/2010-01/msg00263.html

4. AFAIK the GCC rules are not documented except for [5]. But I think
I've got some idea of what they want. There is some hope that it's not
self-contradictory :-)

[5] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Type-punning

BTW, regarding your idea of visibility of unions inside a function: it
was proposed[6] and kinda rejected[7] in a discussion on the gcc mailing
list.

[6] http://open-std.org/jtc1/sc22/wg14/www/docs/n1090.htm
[7] https://gcc.gnu.org/ml/gcc/2004-12/msg00164.html

> typedef union {
>     struct { uint32_t a1, a2; } a;
>     uint64_t b;
> } any_t;
>
> void copy2(uint32_t *dst, uint32_t *src)
> {
>     ((any_t *)dst)->b = ((any_t *)src)->b;
> }
>
> There's no access to members of the union through a pointer (nor even
> through an array), but there's expected to be access through uint32_t *
> pointers in the caller of copy2(). Would a compiler inlining copy2() be
> guaranteed to do what the programmer expected (copy two 32-bit values,
> potentially faster and assuming 64-bit alignment)?

According to (my understanding of) the C standard: it's ok when dst and
src happen to be aligned as required for uint64_t, and undefined
behavior at the pointer conversion otherwise. gcc 4.9.1 on my x86_64
GNU/Linux shows _Alignof(uint32_t) == 4 and _Alignof(uint64_t) == 8.
IOW: not ok.

GCC: never ok, because there is no object of type any_t at the location
that dst or src points to.

> Or with the opposite uses of the two integer types:
>
> void add32x2(uint64_t *dst, uint64_t *src)
> {
>     ((any_t *)dst)->a.a1 += ((any_t *)src)->a.a1;
>     ((any_t *)dst)->a.a2 += ((any_t *)src)->a.a2;
> }
>
> where the caller is expected to access through uint64_t * pointers.

C standard: ok (assuming _Alignof(any_t) == _Alignof(uint64_t) >=
_Alignof(uint32_t)).

GCC: never ok, because there is no object of type any_t at the location
that dst or src points to.

> (Of course, this example is sensitive to byte order - or rather, to the
> order of 32-bit halves in a 64-bit word.)

The order of 32-bit halves in a 64-bit word is probably not important in
your example. What matters is that the halves from the POV of logical
bits are the same as the halves from the POV of storage. AFAIU the
location of specific bits of a uint64_t inside its 8 bytes is not
specified.

>> Side note: not much has changed between C89 and C99 in this question.
>> Accessing a wrong member in a union is an implementation-defined
>> behavior in C89 but a footnote in 3.3.2.3 implies that the reason for
>> this is indeterminate byte order. OTOH this behavior is defined in C99
>> but the byte order is still not specified. Hence a strictly conforming
>> program shouldn't use it anyway.
>
> There are many use cases where byte order does not matter, such as when
> implementing a maybe-faster memset() or memcpy() alike that would use a
> wider data type (as long as alignment and total size permit).

GCC has the "may_alias" type attribute[1] for such cases.

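A minimal sketch of how the two-word copy discussed above could be
written without relying on the union cast. It assumes GCC or Clang for
the may_alias variant; the names u64_alias, copy2_memcpy and
copy2_may_alias are hypothetical and do not come from the thread. The
memcpy variant stays within the C standard, while the may_alias variant
is a GCC extension and still requires the caller to guarantee uint64_t
alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang extension: accesses through this type are allowed to alias
   objects of any other type, much like accesses through char.
   (u64_alias is a hypothetical name used only for this sketch.) */
typedef uint64_t __attribute__((__may_alias__)) u64_alias;

/* Portable variant: memcpy copies the object representation, so no
   lvalue of a mismatched type is ever used. Compilers commonly turn a
   fixed-size memcpy like this into a single wide load and store. */
static void copy2_memcpy(uint32_t *dst, const uint32_t *src)
{
    memcpy(dst, src, 2 * sizeof *src);
}

/* GCC-specific variant: one 64-bit load and store through the
   may_alias type. The attribute addresses aliasing only; alignment
   remains the caller's responsibility, hence the run-time checks. */
static void copy2_may_alias(uint32_t *dst, const uint32_t *src)
{
    assert((uintptr_t)dst % _Alignof(uint64_t) == 0);
    assert((uintptr_t)src % _Alignof(uint64_t) == 0);
    *(u64_alias *)dst = *(const u64_alias *)src;
}
```

Both functions copy the same 8 bytes, so byte order does not matter
here, in line with the memcpy()-like use case quoted above.
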
[1] https://gcc.gnu.org/onlinedocs/gcc/Type-Attributes.html#index-g_t_0040code_007bmay_005falias_007d-type-attribute-3372

--
Alexander Cherepanov