Message-ID: <20201119192708.GW2672@gate.crashing.org>
Date: Thu, 19 Nov 2020 13:27:08 -0600
From: Segher Boessenkool <segher@...nel.crashing.org>
To: David Laight <David.Laight@...lab.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Peter Zijlstra <peterz@...radead.org>,
Florian Weimer <fw@...eb.enyo.de>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Sami Tolvanen <samitolvanen@...gle.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
Matt Mullins <mmullins@...x.us>,
Ingo Molnar <mingo@...hat.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Dmitry Vyukov <dvyukov@...gle.com>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
Andrii Nakryiko <andriin@...com>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...omium.org>,
netdev <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
Kees Cook <keescook@...omium.org>,
Josh Poimboeuf <jpoimboe@...hat.com>,
"linux-toolchains@...r.kernel.org" <linux-toolchains@...r.kernel.org>
Subject: Re: violating function pointer signature
On Thu, Nov 19, 2020 at 05:42:34PM +0000, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 19 November 2020 16:35
> > I just meant "valid C language code as defined by the standards". Many
> > people want all UB to just go away, while that is *impossible* to do for
> > many compilers: for example where different architectures or different
> > ABIs have contradictory requirements.
>
> Some of the UB in the C language are (probably) there because
> certain (now obscure) hardware behaved that way.
Yes.
> For instance integer arithmetic may saturate on overflow
> (or do even stranger things if the sign is a separate bit).
And some still does!
> I'm not quite sure it was ever possible to write a C compiler
> for a CPU that processed numbers in ASCII (up to 10 digits);
> binary arithmetic was almost impossible.
A machine that really stores decimal numbers? Not BCD or the like?
Yeah wow, that will be hard.
> There are also CPUs that only have 'word' addressing - so
> that 'pointers to characters' take extra instructions.
Such machines are still made, and are programmed in C as well.
> ISTM that a few years ago the gcc developers started looking
> at some of these 'UB' and decided they could make use of
> them to make some code faster (and break other code).
When UB would happen in some situation, the compiler can simply assume
that situation does not happen. This makes it possible to do a lot of
optimisations (many to do with loops) that cannot be done otherwise
(including those to do with signed overflow). And many of those
optimisations are worthwhile.
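To make that concrete, here is a minimal sketch (not from the thread; the
function names are made up for illustration). In the first function,
x + 1 overflows only when x == INT_MAX, and that is UB, so the compiler may
assume it never happens and fold the whole comparison to 1. The second
function asks the same question without ever overflowing.

```c
#include <limits.h>

/* Hypothetical example: the only input for which x + 1 > x could be
   false is x == INT_MAX, where x + 1 is signed overflow (UB).  The
   compiler may therefore assume that input never occurs and reduce
   this function to "return 1". */
int successor_is_greater(int x)
{
    return x + 1 > x;
}

/* Well-defined for every int: no arithmetic that can overflow. */
int has_successor(int x)
{
    return x < INT_MAX;
}
```

With GCC, building with -fwrapv defines signed overflow as wrapping, and
then successor_is_greater(INT_MAX) has to return 0; without it, the call
is simply UB.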
> One of the problems with UB is that whereas you might expect
> UB arithmetic to generate an unexpected result and/or signal,
> it is completely open-ended and could fire an ICBM at the coder.
Yes, UB is undefined behaviour. Unspecified behaviour is something else
(C has that as well, and also implementation-defined behaviour, etc.)
In some cases GCC (and any other modern compiler) can turn UB into IB
instead, for example with a flag, as the -fno-strict-* options do. In other
cases it isn't so easy at all. In cases like you have here (where the
validity of what you want to do depends on the ABI in effect) things are
not easy :-/
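As one illustration of the flag case (a sketch under stated assumptions,
not something from the thread): reading a float through a uint32_t pointer
breaks the aliasing rules, and GCC's -fno-strict-aliasing turns off the
optimisations that rely on those rules. The portable alternative is to copy
the bytes. The function name is hypothetical, and the expected bit pattern
assumes float is a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* The cast form,
 *     return *(uint32_t *)&f;
 * accesses a float object through a uint32_t lvalue, which is UB in
 * standard C.  It is only dependable when the compiler is told not to
 * optimise on the aliasing rules (for GCC: -fno-strict-aliasing).
 *
 * Copying the object representation is well-defined with any flags,
 * and compilers typically turn the memcpy into a plain register move. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

On an IEEE 754 target, float_bits(1.0f) gives 0x3f800000.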
Segher