Message-ID: <CAHk-=wgnmrbQhnXdpi=sY68m4OJff+qSiOUY-L8SF_u8JkHe8A@mail.gmail.com>
Date: Tue, 30 Jul 2024 16:54:43 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Guenter Roeck <linux@...ck-us.net>
Cc: Peter Zijlstra <peterz@...radead.org>, Jens Axboe <axboe@...nel.dk>,
Andy Lutomirski <luto@...nel.org>, Ingo Molnar <mingo@...hat.com>, Peter Anvin <hpa@...or.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, "the arch/x86 maintainers" <x86@...nel.org>
Subject: Re: Linux 6.11-rc1
On Tue, 30 Jul 2024 at 16:29, Guenter Roeck <linux@...ck-us.net> wrote:
>
> Baffled. Is it possible that the crashing code catches some page boundary?
We've definitely seen things like that before. Some alignment change
makes something cross a cacheline or page boundary, and it magically
causes a huge regression.
Usually it's about performance, though, not this kind of thing.
But I could imagine that some odd instruction rewriting thing goes
wrong only when the instruction crosses a page boundary, and that
we've never happened to hit that case, and then some kernel config
moves the affected code around just enough.
That would then indirectly also explain why only some compiler
versions hit it - because it all depends on hitting that exact page
crosser.
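To make the "page crosser" condition concrete, here is a minimal sketch of the check for whether an instruction's bytes straddle a page boundary. It is illustrative only and not kernel code: the helper name crosses_page() and the PAGE_SIZE_ASSUMED constant are hypothetical, and it assumes 4 KiB pages.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Hypothetical constant, not the kernel's PAGE_SIZE. */
#define PAGE_SIZE_ASSUMED 4096u

/*
 * Hypothetical helper: returns true if the byte range [addr, addr + len)
 * touches more than one page, i.e. its first and last bytes live on
 * different pages. An empty range never crosses.
 */
static bool crosses_page(uintptr_t addr, size_t len)
{
    if (len == 0)
        return false;
    return (addr / PAGE_SIZE_ASSUMED) != ((addr + len - 1) / PAGE_SIZE_ASSUMED);
}
```

With this, a 5-byte instruction at 0x1ffb ends exactly on the last byte of its page and does not cross, while the same instruction at 0x1ffc does. A shift of a single byte in code layout is enough to flip the result, which is why a config or compiler change could expose such a case.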
You also seemed to say that it only happened with some CPU selections.
Maybe there's something wrong with the ALTERNATIVE() cleanups - I'm
looking at that new "nested alternatives macros" thing, and the odd
games we play with the origin and replacement lengths etc.
That all looks entirely crazy. That file was hard to read before, now
it's just incomprehensible to me.
Linus