Message-ID: <CA+55aFzU8tRSuCoS2TmXjGZ5WRUzMCrG6Jk4eNCQZjP+5sVE9w@mail.gmail.com>
Date: Tue, 26 Dec 2017 18:16:37 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Alexandru Chirvasitu <achirvasub@...il.com>
Cc: Andy Lutomirski <luto@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	kernel list <linux-kernel@...r.kernel.org>,
	Borislav Petkov <bp@...en8.de>,
	Brian Gerst <brgerst@...il.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Josh Poimboeuf <jpoimboe@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: PROBLEM: consolidated IDT invalidation causes kexec to reboot

On Tue, Dec 26, 2017 at 3:19 PM, Alexandru Chirvasitu
<achirvasub@...il.com> wrote:
>
> I went back to the initial problematic commit e802a51 and modified it as you suggest:

Thank you.

> This did not work out for me, but now it fails differently. Both
> (kexec -l + kexec -e) and (kexec -p + echo c > /proc/sysrq-trigger)
> end in call traces and freezes.
>
> It does seem to be tied to idt_invalidate. One of the last things I
> see on the screen (which is ends up frozen with the computer inactive)
> is
>
> EIP: idt_invalidate+0x6/0x40 SS:ESP: 0068:f6c47cd0

Yes, interesting, it's the stack canary load access there:

    mov    %gs:0x14,%edx

that traps.

And that actually makes a lot of sense: the load_segments() call just
above has reloaded all segments with __KERNEL_DS.

So while the stack canary access *intends* to load it from the magic
stack canary segment (offset 0x14), we've just reset all segments to
the standard zero-based full-sized ones, and obviously that will take
a page fault at 0x14.

And the reason you now actually *see* the page fault is that we
haven't completely buggered the CPU state now, so the trap handler
actually works.

With the GDT reset before, it used to take that same trap, but now the
trap handler itself would fault, and cause a triple fault - which
resets the machine.

So it wasn't actually tracing, it was the stack canary all along.

So at least it's truly root-caused now. But the fix is the same: we
just can't afford to do any function calls.

Alternatively, we should just fix that insane "load_segments()". I'm
not sure why the code insists on reloading the segments in the first
place.

So you could try just to remove the "load_segments()" line entirely.

Thanks for spending the time testing things out,

               Linus