Message-ID: <bf7539e0ccb8f445984fe6dab0d7d8392a79880d.camel@tugraz.at>
Date: Sun, 12 May 2024 21:29:15 +0200
From: Martin Uecker <uecker@...raz.at>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Kees Cook <keescook@...omium.org>, Justin Stitt
<justinstitt@...gle.com>, Peter Zijlstra <peterz@...radead.org>, Mark
Rutland <mark.rutland@....com>, linux-hardening@...r.kernel.org,
linux-kernel@...r.kernel.org, llvm@...ts.linux.dev
Subject: Re: [RFC] Mitigating unexpected arithmetic overflow
On Sunday, 12.05.2024 at 09:09 -0700, Linus Torvalds wrote:
> On Sun, 12 May 2024 at 01:03, Martin Uecker <uecker@...raz.at> wrote:
> >
> > But I guess it still could be smarter. Or does it have to be a
> > sanitizer because compile-time will always have too many false
> > positives?
>
> Yes, there will be way too many false positives.
>
> I'm pretty sure there will be a ton of "intentional positives" too,
> where we do drop bits, but it's very much intentional. I think
> somebody already mentioned the "store little endian" kind of things
> where code like
>
> unsigned char *p;
> u32 val;
>
> p[0] = val;
> p[1] = val >> 8;
> p[2] = val >> 16;
> p[3] = val >> 24;
>
> kind of code is both traditional and correct, but obviously drops bits
> very much intentionally on each of those assignments.
>
> Now, obviously, in a perfect world the compiler would see the above as
> "not really dropping bits", but that's not the world we live in.
>
> So the whole "cast drops bits" is not easy to deal with.
>
> In the case of the above kind of byte-wise behavior, I do think that
> we could easily make the byte masking explicit, and so in *some* cases
> it might actually be a good thing to just make these things more
> explicit, and write it as
>
> p[0] = val & 0xff;
> p[1] = (val >> 8) & 0xff;
> ...
>
> and the above doesn't make the source code worse: it arguably just
> makes things more explicit both for humans and for the compiler, with
> that explicit bitwise 'and' operation making it very clear that we're
> just picking a particular set of bits out of the value.
Adding variants of the -Wconversion warning that trigger only
in very specific cases should not be too difficult, if something
like this is useful, i.e. restricting the warning to assignments.
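For illustration, a minimal sketch of the explicit-mask form of the little-endian store quoted above. The helper name store_le32 is hypothetical and not taken from the kernel; the point is only that each assignment states which byte it keeps, so a narrowly scoped truncation warning would have nothing to report here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value in little-endian byte order.
 * Each assignment masks explicitly, so the dropped bits are visible in
 * the source rather than implied by the narrowing assignment. */
static inline void store_le32(unsigned char *p, uint32_t val)
{
    assert(p != NULL);
    p[0] = val & 0xff;
    p[1] = (val >> 8) & 0xff;
    p[2] = (val >> 16) & 0xff;
    p[3] = (val >> 24) & 0xff;
}
```

The last mask is redundant for a 32-bit value, but keeping it makes all four lines uniform.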
>
> But I do suspect the "implicit cast truncates value" is _so_ common
> that it might be very very painful. Even with a run-time sanitizer
> check.
>
> And statically I think it's entirely a lost cause - it's literally
> impossible to avoid in C. Why? Because there are no bitfield
> variables, only fields in structures/unions, so if you pass a value
> around as an argument, and then end up finally assigning it to a
> bitfield, there was literally no way to pass that value around as the
> "right type" originally. The final assignment *will* drop bits from a
> static compiler standpoint.
>
If one wanted to, one could always pass bitfields inside a struct:

    typedef struct { unsigned int v:12; } b12;

    int f(b12 x)
    {
        int i = x.v;
        return i & (1 << 13);
    }

The compiler is then smart enough to know how many bits are
relevant and to track this to some degree inside the function.

https://godbolt.org/z/o8P3adnEK

But using this information for warnings would be more difficult
because the information is not computed in the front end (but
there are also other warnings generated by the back end, so it
is not impossible). And, of course, the additional wrapping and
unwrapping makes the code uglier (*)
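
To give a rough idea of what that wrapping and unwrapping looks like in practice, here is a sketch built on the b12 struct above. The helpers b12_wrap and b12_unwrap, and struct hdr with its fields, are hypothetical names made up for this example. The idea is that the range check sits in one place, and the value then travels as a 12-bit type all the way to the final bitfield assignment.

```c
#include <assert.h>

/* 12-bit value carried inside a struct, as in the example above. */
typedef struct { unsigned int v:12; } b12;

/* Hypothetical helper: wrap a plain integer, checking once that it fits. */
static inline b12 b12_wrap(unsigned int x)
{
    assert(x < (1u << 12));            /* value must fit in 12 bits */
    return (b12){ .v = x & 0xfffu };   /* explicit mask, no silent truncation */
}

/* Hypothetical helper: unwrap back to a plain integer. */
static inline unsigned int b12_unwrap(b12 x)
{
    return x.v;
}

/* Hypothetical structure with a 12-bit destination field. */
struct hdr {
    unsigned int len:12;
    unsigned int flags:4;
};

static inline void hdr_set_len(struct hdr *h, b12 len)
{
    h->len = len.v;   /* 12-bit field to 12-bit field: no bits dropped */
}
```

A caller would then write something like hdr_set_len(&h, b12_wrap(n)), which is where the extra noise comes from.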
C23 then also has bit-precise integer types (_BitInt(N)).
Martin
(*) ...but compared to what some other languages require the
programmer to write, even this seems relatively benign.