linux-kernel - Re: [GIT PULL] Crypto Update for 6.10

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=whRK3wdMoUd7UOL-R8GcfwYYLsy26ft-kHv5WaofyYe=Q@mail.gmail.com>
Date: Tue, 14 May 2024 10:07:02 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Lukas Wunner <lukas@...ner.de>
Cc: Herbert Xu <herbert@...dor.apana.org.au>, "David S. Miller" <davem@...emloft.net>, 
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, 
	Linux Crypto Mailing List <linux-crypto@...r.kernel.org>, Julia Lawall <julia.lawall@...ia.fr>, 
	Nicolas Palix <nicolas.palix@...g.fr>, cocci@...ia.fr
Subject: Re: [GIT PULL] Crypto Update for 6.10

On Mon, 13 May 2024 at 23:54, Lukas Wunner <lukas@...ner.de> wrote:
>
> On Mon, May 13, 2024 at 03:12:53PM -0700, Linus Torvalds wrote:
> >
> > > https://lore.kernel.org/all/202404252210.KJE6Uw1h-lkp@intel.com/
> >
> > looks *very* much like the cases we've seen with clang in particular
> > where clang goes "this code isn't reachable, so I'll just drop
> > everything on the floor", and then it just becomes a fallthrough to
> > whatever else code happens to come next. Most of the time that's just
> > more (unrelated) code in the same function, but sometimes it causes
> > that "falls through to next function" instead, entirely randomly
> > depending on how the code was laid out.
>
> Curiously, this particular 0-day report is for gcc 13.2.0 though,
> not clang.

Hmm. I think all the previous reports of "falls through to next
function" that I have seen have been with clang, but that is probably
be selection bias: the gcc cases of this tend to be found so much more
quickly (because gcc is still more common at least on x86) that by the
time I see the reports, it's because of some clang issue.

And in fact, when I go test this theory by going to search on lore, I
do see several gcc reports.

So no, it was never just clang-only, it was just that the ones I had
looked at were about clang.

> The assume() macro had no effect with clang when I tested it.

I suspect that the issue is that with *normal* kernel configurations,
the code generation is simple and straightforward enough that gcc did
the right thing.

And then some more complicated setup with more debugging support
enabled (particularly things like UBSAN or KASAN) the code gets
complicated enough that gcc doesn't do the optimization any more, and
then the conditional in assume() doesn't get optimized away at an
early stage any more, and remains as a conditional branch to
la-la-land.

And you actually don't even see this as a warning unless the
la-la-land happens to be at the end of a function. IOW, the "branch to
nowhere" _could_ just branch to some label inside the function, and
the objtool sanity check would never even have triggered.

That's why "unreachable()" can be so dangerous. It tells the compiler
that code generation in one place no longer matters, and then the
compiler can decide to leave things just dangling in odd ways.

The code presumably still *works* - because the actual conditional
never triggers, so in that sense it's safe and fine. But it's still
just horrendous to try to figure out, which is why I was so down on
it.

              Linus