[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFy0jPr5udwJx8cZ-QFs-OgZzQ_NJm_VSj_cS7tsKUn3Kw@mail.gmail.com>
Date: Mon, 22 Apr 2013 17:23:04 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Ingo Molnar <mingo@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
David Miller <davem@...emloft.net>,
"Theodore Ts'o" <tytso@....edu>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
Network Development <netdev@...r.kernel.org>,
"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: Unsigned widening casts of binary "not" operations..
[ Ugh, resending because I had mistakenly set the html bit and all the
mailing lists just refused the original... ]
So I was playing with sparse again, because the MIPS people had hit a
bug with the fact that they had made PAGE_MASK an *unsigned* type, and
then doing the ~ (binary not) operation on it does the wrong thing
when you operate on bigger types, like hardware 36-bit physical
addresses.
See commit 3b5e50edaf50 (and commit c17a6554 that it reverts).
So the issue is that let's say that you have a constant (or variable,
for that matter) that is of type "unsigned int", and then you use that
to mask a variable of a larger size (say it's an "unsigned long" on a
64-bit arch, or it's a phys_addr_t on a 32-bit arch with PAE). So the
code looks basically like a variation of something like this:
u64 value = ...;
value &= ~bitmask;
What happens?
What happens is that the "~bitmask" is done in the *narrower* type,
and then - because the narrower type is unsigned - the cast to the
wider type is done as an *unsigned* cast, so what you *think* happens
is that it clears the bits that are set in "bitmask", but what
*actually* happens is that yes, you clear the bits that are set in
:bitmask", but you *also* clear the upper bits of value.
Now, why am I posting about this MIPS-specific small detail on to x86 people?
Because I hacked up a sparse patch that looks for the pattern of
unsigned casts of a (bit) negation to a wider type, and there's a fair
number of them.
Now, it may well be that my sparse hack (and it really is a hack) is a
broken piece of crap, but from a quick look the warnings it gives look
reasonable.
And there's quite a lot of them. Even in my (fairly small) config I
use on my desktop. And the first warnings I see are in x86 code:
arch/x86/kernel/traps.c:405:16: warning: implicit unsigned
widening cast of a '~' expression
arch/x86/kernel/cpu/perf_event_p4.c:912:15: warning: implicit
unsigned widening cast of a '~' expression
that particular one is something that has an *explicit* cast to "u64",
and then it does binary 'and' operations with these things that are of
a 32-bit unsigned type with a binary not in front of them. IOW, the
types in that expression are *very* confused.
Here's a ext4 code snippet that looks like an actual bug (but seems to
only hit read-ahead):
ext4_fsblk_t b, block;
b = block & ~(EXT4_SB(sb)->s_inode_readahead_blks-1);
where "b" actually ends up having the upper bits cleared, because the
s_inode_readahead_blks thing is an unsigned int, so you're masking off
not just the low bits, but the high bits too. Ted? Of course, it's
just read-ahead, so it probably doesn't matter, but.
We have a number of generic code examples where this kind of thing
seems harmless:
*vm_flags &= ~VM_MERGEABLE;
(we only have flags in the low 32 bits, and the only reason we get
warnings for VM_MERGEABLE is because 0x80000000 is an implicitly
unsigned constant, while 0x40000000 is *not*), or
kernel/trace/trace.c:2910:32: warning: implicit unsigned widening
cast of a '~' expression
where "trace_flags" is "unsigned long" and "mask" is "unsigned int",
and the expression
trace_flags &= ~mask;
actually clears the upper 32 bits too, but presumably they are always
clear anyway so we don't *really care. But it's an example of code
that is potentially very subtly dangerous.
Now, a lot of the other warnings I get seem to be more benign - the
networking code results in a lot of these because of code like this:
#define NLMSG_ALIGNTO 4U
#define NLMSG_ALIGN(len) ( ((len)+NLMSG_ALIGNTO-1) & ~(NLMSG_ALIGNTO-1) )
which technically has the same problem if "len" is 64-bit (which
sizeof() is on x86-64), but we don't really *care* because it turns
out that the sizeof values will always have the high bits clear
*anyway*. So I'm not sure the hacky sparse warning is useful, because
my code isn't smart enough to figure out when this kind of widening
cast is a problem, and when it isn't.
That said, I'm cc'ing David and netdev too, just in case. There is
likely some *reason* why it uses an "unsigned int" for a constant that
is then commonly expanded with a binary "not" and the upper bits end
up being surprising. So this thing doesn't look like a bug, but it
does cause these warnings:
net/netlink/af_netlink.c:1889:38: warning: implicit unsigned
widening cast of a '~' expression
and maybe the networking people care about this and maybe they don't.
(It turns out that any use of "UINT_MAX" for a "long" value also
results in this warning, because we define it as "~0ul", so there are
other cases of spurious things where we *intentionally* drop the high
bits).
ANYWAY.
Only a few of the warnings looked like they might be bugs, but I'm
attaching my sparse patch here in case somebody wants to play with the
hacky thing..
Linus
Download attachment "sparse.diff" of type "application/octet-stream" (3804 bytes)
Powered by blists - more mailing lists