linux-kernel - Re: [patch V6 04/37] x86: Make hardware latency tracing explicit

Message-ID: <CALCETrVuA25n_d-3KMvvDxuqZeBEEYb6n=QAXOhBFkgS1Dk+UA@mail.gmail.com>
Date:   Sun, 17 May 2020 22:50:56 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Andy Lutomirski <luto@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Alexandre Chartre <alexandre.chartre@...cle.com>,
        Frederic Weisbecker <frederic@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Petr Mladek <pmladek@...e.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Juergen Gross <jgross@...e.com>,
        Brian Gerst <brgerst@...il.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Will Deacon <will@...nel.org>,
        Tom Lendacky <thomas.lendacky@....com>,
        Wei Liu <wei.liu@...nel.org>,
        Michael Kelley <mikelley@...rosoft.com>,
        Jason Chen CJ <jason.cj.chen@...el.com>,
        Zhao Yakui <yakui.zhao@...el.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>
Subject: Re: [patch V6 04/37] x86: Make hardware latency tracing explicit

On Sun, May 17, 2020 at 1:48 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>
> Andy Lutomirski <luto@...nel.org> writes:
> > On Fri, May 15, 2020 at 5:10 PM Thomas Gleixner <tglx@...utronix.de> wrote:
> >>
> >>
> >> The hardware latency tracer calls into trace_sched_clock and ends up in
> >> various instrumentable functions, which is problematic vs. the kprobe
> >> handling, especially the text poke machinery. It's invoked from
> >> nmi_enter/exit(), i.e. non-instrumentable code.
> >>
> >> Use nmi_enter/exit_notrace() instead. These variants do not invoke the
> >> hardware latency tracer which avoids chasing down complex callchains to
> >> make them non-instrumentable.
> >>
> >> The real interesting measurement is the actual NMI handler. Add an explicit
> >> invocation for the hardware latency tracer to it.
> >>
> >> #DB and #BP are uninteresting as they really should not be in use when
> >> analyzing hardware-induced latencies.
> >>
> >
> >> @@ -849,7 +851,7 @@ static void noinstr handle_debug(struct
> >>  static __always_inline void exc_debug_kernel(struct pt_regs *regs,
> >>                                              unsigned long dr6)
> >>  {
> >> -       nmi_enter();
> >> +       nmi_enter_notrace();
> >
> > Why can't exc_debug_kernel() handle instrumentation?  We shouldn't
> > recurse into #DB since we've already cleared DR7, right?
>
> It can later on. The point is that the trace stuff calls into the world
> and some more before the entry handling is complete.
>
> Remember this is about ensuring that all the state is properly
> established before any of this instrumentation muck can happen.
>
> DR7 handling is specific to #DB and done even before nmi_enter to
> prevent recursion.

So why is this change needed?