Date: Fri, 24 Jul 2015 13:06:39 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andy Lutomirski <luto@...capital.net>, X86 ML <x86@...nel.org>,
    "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
    Willy Tarreau <w@....eu>, Borislav Petkov <bp@...en8.de>,
    Thomas Gleixner <tglx@...utronix.de>,
    Steven Rostedt <rostedt@...dmis.org>, Brian Gerst <brgerst@...il.com>
Subject: Re: Dealing with the NMI mess

On Thu, Jul 23, 2015 at 02:54:54PM -0700, Linus Torvalds wrote:
> On Thu, Jul 23, 2015 at 2:45 PM, Andy Lutomirski <luto@...capital.net> wrote:
> >
> > Or we just re-enable them on the way out of NMI (i.e. the very last
> > thing we do in the NMI handler). I don't want to break regular
> > userspace gdb when perf is running.
>
> I'd really prefer it if we don't touch NMI code in those kinds of
> ways. The NMI code is fragile as hell. All the problems we have with
> it is exactly due to "where is the boundary" issues.
>
> That's why I *don't* want NMI code to do magic crap. Anything that
> says "disable this during this magic window" is broken. The problems
> we've had are exactly about atomicity of the entry/exit conditions,
> and there is no really good way to get them right.
>
> I'd be much happier with a _TIF_USER_WORK_MASK approach exactly
> because it's so *obvious* that it's not a boundary condition.
>
> I dislike the "disable and re-enable dr7 in the NMI handler" exactly
> because it smells like "we can only handle faults in _this_ region".
> It may be true, but it's also what I want us to get away from. I'd
> much rather have the "big picture" be that we can take faults anywhere
> at all (*), and that none of the core code really cares. Then we "fix
> up" user space.

A wee bit something like so?

We need the intermediate self-IPI because NMI/MCE etc do not deal with
TIF flags.

I further cleared all of DR7 in an attempt at reducing the amount of
state tracked. And it doesn't distinguish between kernel/user
watchpoints because the kernel can touch both from !IF.

---
 arch/x86/kernel/traps.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 8e65d8a9b8db..e8308e9c2b1e 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -570,6 +570,33 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
 NOKPROBE_SYMBOL(fixup_bad_iret);
 #endif
 
+struct do_debug_state {
+	unsigned long dr7;
+	struct irq_work irq_work;
+	struct callback_head task_work;
+};
+
+static void __debug_irq_trampoline(struct irq_work *work)
+{
+	struct do_debug_state *dds =
+		container_of(work, struct do_debug_state, irq_work);
+
+	task_work_add(current, &dds->task_work, true);
+}
+
+static void __debug_restore_dr7(struct callback_head *work)
+{
+	struct do_debug_state *dds =
+		container_of(work, struct do_debug_state, task_work);
+
+	set_debugreg(dds->dr7, 7);
+}
+
+static DEFINE_PER_CPU(struct do_debug_state, do_debug_state) = {
+	.irq_work = { .func = __debug_irq_trampoline, },
+	.task_work = { .func = __debug_restore_dr7, },
+};
+
 /*
  * Our handling of the processor debug registers is non-trivial.
  * We do not clear them on entry and exit from the kernel. Therefore
@@ -603,6 +630,16 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 
 	ist_enter(regs);
 
+	if (arch_irqs_disabled_flags(regs->flags)) {
+		struct do_debug_state *dds = this_cpu_ptr(&do_debug_state);
+
+		get_debugreg(dds->dr7, 7);
+		set_debugreg(0, 7);
+		irq_work_queue(&dds->irq_work);
+
+		goto exit;
+	}
+
 	get_debugreg(dr6, 6);
 
 	/* Filter out all the reserved bits which are preset to 1 */
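
For readers following the control flow of the patch, the three-stage deferral (debug trap with interrupts off, then the irq_work self-IPI, then task work on the way back to user space) can be modelled in plain userspace C. This is only a rough sketch of the ordering, not kernel code: `struct sim_cpu` and every `sim_*` function are hypothetical names invented for illustration, and the booleans merely stand in for what `irq_work_queue()` and `task_work_add()` would schedule.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU; none of these names exist in the kernel. */
struct sim_cpu {
	unsigned long dr7;        /* stands in for the live DR7 register */
	unsigned long saved_dr7;  /* value stashed by the debug trap */
	bool irq_work_pending;    /* models a queued irq_work (self-IPI) */
	bool task_work_pending;   /* models task work queued for 'current' */
};

/*
 * Stage 1: debug trap. If the interrupted context had interrupts
 * disabled, stash DR7, clear it, and request the self-IPI.
 */
static void sim_debug_trap(struct sim_cpu *cpu, bool irqs_disabled)
{
	if (!irqs_disabled)
		return;	/* the normal do_debug() path is not modelled here */

	cpu->saved_dr7 = cpu->dr7;
	cpu->dr7 = 0;
	cpu->irq_work_pending = true;
}

/*
 * Stage 2: the self-IPI runs once interrupts are enabled again. It only
 * hands the job over to task work, since it runs in a context where
 * queueing task work is allowed.
 */
static void sim_run_irq_work(struct sim_cpu *cpu)
{
	if (!cpu->irq_work_pending)
		return;

	cpu->irq_work_pending = false;
	cpu->task_work_pending = true;
}

/* Stage 3: on return to user space the task work restores DR7. */
static void sim_return_to_user(struct sim_cpu *cpu)
{
	if (!cpu->task_work_pending)
		return;

	cpu->task_work_pending = false;
	cpu->dr7 = cpu->saved_dr7;
}

/* Walk through all three stages; returns the DR7 value seen by user space. */
static unsigned long sim_demo(unsigned long initial_dr7)
{
	struct sim_cpu cpu = { .dr7 = initial_dr7 };

	sim_debug_trap(&cpu, true);
	assert(cpu.dr7 == 0);		/* watchpoints are off inside the kernel */

	sim_run_irq_work(&cpu);
	assert(cpu.dr7 == 0);		/* still off until user return */

	sim_return_to_user(&cpu);
	return cpu.dr7;
}
```

Calling `sim_demo()` with any starting value returns that same value, while DR7 reads as zero at both intermediate checkpoints. That mirrors the intent stated in the mail: nothing in the NMI/MCE path has to know about a window, and the fix-up happens only on the way back to user space.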