[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.0.9999.0712171009570.21557@woody.linux-foundation.org>
Date: Mon, 17 Dec 2007 10:28:38 -0800 (PST)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Eric Dumazet <dada1@...mosbay.com>
cc: security@...nel.org, tytso@....edu,
Herbert Xu <herbert@...dor.apana.org.au>,
John Reiser <jreiser@...Wagon.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-sparse@...r.kernel.org, mpm@...enic.com,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [Security] Signed divides vs shifts (Re: /dev/urandom uses uninit
bytes, leaks user data)
On Mon, 17 Dec 2007, Eric Dumazet wrote:
>
> while
>
> long *mid(long *a, long *b)
> {
> return ((a - b) / 2u + a);
> }
This is exactly what I'm talking about. That "2u" is TOTALLY POINTLESS.
It's an "unsigned int", but since (a-b) will be of type ptrdiff_t, and is
*wider* on a 64-bit architecture (it's the same as "long" on x86-64), then
the 2u will just be converted to "long", and be signed again!
So you thought that you did an unsigned divide, but you did no such thing.
If you change the "2u" to a "2ul", it works again, and you get
mid:
movq %rdi, %rax
subq %rsi, %rax
sarq %rax
andq $-8, %rax
addq %rdi, %rax
ret
which is the code you wanted. But quite frankly, you could just have
written it with a shift to start with, and avoided the subtle type issue,
although gcc then generates
movq %rdi, %rax
subq %rsi, %rax
sarq $4, %rax
leaq (%rdi,%rax,8), %rax
ret
instead. Of course, this all *does* still have subtle sign issues, because
the "a-b" part implies a signed divide in itself, which is why you see
that "sarq" in he first place (rather than a "shrq").
Signed divides are hard. The "a-b" pointer subtraction is actually cheaper
than a general signed divide by sizeof, since the compiler can then assume
that the two pointers are mutually aligned, which is why gcc can generate
just a single "sarq" instead of having to do an extra "add negative bit"
thing to get the rounding right.
[ So Al, when you said that
(a-b)
is equivalent to
((char *)a-(char *)b)/4
for a "int *" a and b, you're right in the sense that the *result* is
the same, but the code generation likely isn't. The "a-b" thing can (and
does) allow the compiler to avoid the whole "align up for signed
numbers" thing, and the difference in code generation is clear:
subq %rsi, %rdi
sarq $2, %rdi
vs
subq %rsi, %rdi
leaq 3(%rdi), %rax
testq %rdi, %rdi
cmovs %rax, %rdi
sarq $2, %rdi
exactly because the first case *knows* that the low two bits have to be
zero, and thus there is no rounding issue. ]
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists