linux-kernel - Re: [RFC][PATCH] tracing/x86: Save CR2 before tracing irqsoff on error


Message-ID: <20190321083317.GL6058@hirez.programming.kicks-ass.net>
Date:   Thu, 21 Mar 2019 09:33:17 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
        Andy Lutomirski <luto@...capital.net>,
        Joel Fernandes <joel@...lfernandes.org>,
        He Zhe <zhe.he@...driver.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFC][PATCH] tracing/x86: Save CR2 before tracing irqsoff on
 error_entry

On Wed, Mar 20, 2019 at 10:15:34PM -0400, Steven Rostedt wrote:

> And it would crash similarly each time I tried it, but always at a
> different place. After spending the day on this, I finally figured it
> out. The bug is happening in entry_64.S right after error_entry.
> There's two TRACE_IRQS_OFF in that code path, which if I comment out,
> the bug goes away. Then it dawned on me that the crash always happens
> when systemd does a normal page fault. We had this bug before, and it
> was with the exception trace points.

0ac09f9f8cd1 ("x86, trace: Fix CR2 corruption when tracing page faults")
d4078e232267 ("x86, trace: Further robustify CR2 handling vs tracing")

Or were you talking about:

70fb74a5420f ("x86: Save cr2 in NMI in case NMIs take a page fault (for i386)")

> The issue is that a tracepoint can fault (reading vmalloc or whatever).
> And doing a userspace stack trace most definitely will fault. But if we
> are coming from a legitimate page fault, the address of that fault (in
> the CR2 register) will be lost if we fault before we get to the page
> fault handler. That's exactly what is happening.

Sheesh, that could've been written much more clearly. So you're saying:

idtentry page_fault             do_page_fault           has_error_code=1
  call error_entry
    TRACE_IRQS_OFF
      call trace_hardirqs_off*
        <tracer stuff>
          <fault> # modifies CR2
  call do_page_fault
    address = read_cr2(); /* whoopsie */

Right?

> To solve this, a TRACE_IRQS_OFF_CR2 (and ON for consistency) was added
> that saves the CR2 register. A new trace_hardirqs_off_thunk_cr2 is
> created that stores the cr2 register, calls the
> trace_hardirqs_off_caller, then on return restores the cr2 register if
> it changed, before returning.

Yuck.. also, not consistent with the actual patch. The thunk doesn't
save/restore CR2.

I really hate making this special TRACE_IRQS_OFF_CR2 thing, it feels far
too fragile. I'd _much_ rather push the #PF CR2 read much earlier.
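
To make the ordering problem concrete, here is a toy user-space model of
the flow sketched earlier. It is only a sketch under loud assumptions:
nothing below is kernel code, every model_* name and both addresses are
made up for illustration, and a plain variable stands in for CR2. It
compares reading "CR2" after the tracing hook (today's order) with
reading it before the hook (the "read it much earlier" idea).

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: a plain variable stands in for the CR2 register. */
static uint64_t model_cr2;

/* Hypothetical addresses, chosen purely for illustration. */
#define MODEL_USER_FAULT_ADDR   0x00007f0000001000ULL /* the "real" user fault */
#define MODEL_TRACER_FAULT_ADDR 0xffffc90000002000ULL /* nested fault in tracer */

/* Models the CPU delivering #PF: CR2 receives the faulting address. */
static void model_raise_page_fault(uint64_t addr)
{
	model_cr2 = addr;
}

/*
 * Models the tracing hook run from the entry code. If the tracer itself
 * faults (vmalloc area, user stack walk), CR2 is overwritten.
 */
static void model_trace_hardirqs_off(int tracer_faults)
{
	if (tracer_faults)
		model_raise_page_fault(MODEL_TRACER_FAULT_ADDR);
}

/* Current order: tracing hook first, CR2 read later in the handler. */
static uint64_t model_pf_late_read(int tracer_faults)
{
	model_raise_page_fault(MODEL_USER_FAULT_ADDR);
	model_trace_hardirqs_off(tracer_faults);
	return model_cr2;		/* may be the tracer's address */
}

/* Alternative order: latch CR2 before anything that can fault. */
static uint64_t model_pf_early_read(int tracer_faults)
{
	uint64_t address;

	model_raise_page_fault(MODEL_USER_FAULT_ADDR);
	address = model_cr2;		/* read before the tracing hook */
	model_trace_hardirqs_off(tracer_faults);
	return address;
}

/* Small self-check of the three interesting cases. */
static void model_self_check(void)
{
	assert(model_pf_late_read(0) == MODEL_USER_FAULT_ADDR);
	assert(model_pf_late_read(1) == MODEL_TRACER_FAULT_ADDR);
	assert(model_pf_early_read(1) == MODEL_USER_FAULT_ADDR);
}
```

In the model, the late read only returns the right address when the
tracer happens not to fault, while the early read does not care. Where
such an early read could actually live in the real entry code is the
open question here, so treat this as an illustration of the ordering
and not as a proposed implementation.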

Also, argh I fscking hate context tracking. That makes all this so much
more complicated. If it weren't for CALL_enter_from_user_mode, we could
pull that TRACE_IRQS_OFF out of error_entry.

Damn... Andy, any bright ideas?