linux-kernel - Re: [RFC v4 3/4] irqflags: Avoid unnecessary calls to trace


Message-ID: <CAJWu+opRDiaL+hRbKYDp-kt-DiRQzu6Y1ceugHSURCsSN+H-aA@mail.gmail.com>
Date:   Wed, 25 Apr 2018 19:18:10 -0700
From:   Joel Fernandes <joelaf@...gle.com>
To:     Paul McKenney <paulmck@...ux.vnet.ibm.com>
Cc:     Namhyung Kim <namhyung@...nel.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-rt-users <linux-rt-users@...r.kernel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Tom Zanussi <tom.zanussi@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Boqun Feng <boqun.feng@...il.com>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Randy Dunlap <rdunlap@...radead.org>,
        Fengguang Wu <fengguang.wu@...el.com>,
        Baohong Liu <baohong.liu@...el.com>,
        Vedang Patel <vedang.patel@...el.com>,
        kernel-team <kernel-team@....com>
Subject: Re: [RFC v4 3/4] irqflags: Avoid unnecessary calls to trace_ if you can

On Sun, Apr 22, 2018 at 8:19 PM, Paul E. McKenney
<paulmck@...ux.vnet.ibm.com> wrote:
> On Sun, Apr 22, 2018 at 06:14:18PM -0700, Joel Fernandes wrote:
[...]
>> I narrowed the performance hit down to the call to
>> rcu_irq_enter_irqson() and rcu_irq_exit_irqson() in __DO_TRACE.
>> Commenting these 2 functions brings the perf level back.
>>
>> I was thinking about RCU usage here, and really we never change this
>> particular performance-sensitive tracepoint's function table 99.9% of
>> the time, so it seems there's quite a win if we just had another
>> read-mostly synchronization mechanism that doesn't do all the RCU
>> tracking that's currently done here, and such a mechanism can be
>> simpler.
>>
>> If I understand correctly, RCU also adds other complications such as
>> that it can't be used from the idle path, that's why the
>> rcu_irq_enter_* was added in the first place. Would be nice if we can
>> just avoid these RCU calls for the preempt/irq tracepoints... Any
>> thoughts about this or any other ideas to solve this?
>
> In theory, the tracepoint code could use SRCU instead of RCU, given that
> SRCU readers can be in the idle loop, although at the expense of a couple
> of smp_mb() calls in each tracepoint.  In practice, I must defer to the
> people who know the tracepoint code better than I.

Paul and I were chatting about how to handle tracing from an NMI. If
the tracepoint implementation were switched to SRCU instead of RCU, a
complication could arise from the use of this_cpu_inc in
srcu_read_lock (via __srcu_read_lock):

int __srcu_read_lock(struct srcu_struct *sp)
{
        int idx;

        idx = READ_ONCE(sp->srcu_idx) & 0x1;
        this_cpu_inc(sp->sda->srcu_lock_count[idx]);
        smp_mb(); /* B */  /* Avoid leaking the critical section. */
        return idx;
}
EXPORT_SYMBOL_GPL(__srcu_read_lock);

If an NMI interrupts the this_cpu_inc, and the NMI handler also
happens to call a tracepoint, the result could be a lost update on
architectures that don't support add-to-memory instructions. Paul said
he wouldn't want to use atomics to resolve this, in order to keep the
SRCU overhead low.
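
To make the lost update concrete, here is a small userspace sketch
(not kernel code). It assumes this_cpu_inc is carried out as a
separate load, add and store, and models the NMI as a plain function
call that lands between the load and the store. The names lock_count,
nmi_tracepoint, split_inc and run_scenario are made up for
illustration.

```c
/* Hypothetical stand-in for one CPU's srcu_lock_count[idx] slot. */
static unsigned long lock_count;

/* Models an NMI handler hitting a tracepoint: one complete increment. */
static void nmi_tracepoint(void)
{
        lock_count++;
}

/*
 * Models this_cpu_inc() on an architecture without an add-to-memory
 * instruction: separate load, add and store. If nmi_fires is nonzero,
 * the "NMI" lands between the load and the store.
 */
static void split_inc(int nmi_fires)
{
        unsigned long tmp = lock_count;  /* load */

        if (nmi_fires)
                nmi_tracepoint();        /* NMI bumps the same slot */
        tmp++;                           /* add, on the stale value */
        lock_count = tmp;                /* store wipes the NMI's update */
}

/* Resets the counter, runs one increment, returns the final count. */
static unsigned long run_scenario(int nmi_fires)
{
        lock_count = 0;
        split_inc(nmi_fires);
        return lock_count;
}
```

run_scenario(0) and run_scenario(1) both return 1, even though the
second case performed two increments. In SRCU terms, the reader that
ran in the NMI would be missing from the lock count.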

One way we discussed to resolve this is to use a separate srcu_struct
for NMI invocations, so that the above lost update cannot occur. We
could check in_nmi() and have srcu_read_lock use the NMI version of
the srcu_struct. Another way would be to just warn for now if the SRCU
version of the trace_ API is used from NMI. That could be fragile if
some code path indirectly results in a tracepoint call, so we should
probably handle it by detecting the context and using the correct
srcu_struct for the srcu_read_lock.
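
A rough userspace sketch of the separate-srcu_struct idea follows. It
is only an illustration under simplifying assumptions, not the kernel
implementation: struct toy_srcu, toy_in_nmi and the toy_* functions
are hypothetical stand-ins for srcu_struct, in_nmi(), srcu_read_lock
and srcu_read_unlock, and the counters are plain per-domain arrays
rather than per-CPU data.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for srcu_struct. */
struct toy_srcu {
        unsigned long lock_count[2];
        unsigned long unlock_count[2];
        int idx;
};

static struct toy_srcu tracepoint_srcu;      /* all non-NMI contexts */
static struct toy_srcu tracepoint_srcu_nmi;  /* NMI context only */

/* Stand-in for in_nmi(); here it is just a flag the caller sets. */
static bool toy_in_nmi;

/* Pick the domain for the current context. */
static struct toy_srcu *tracepoint_srcu_domain(void)
{
        return toy_in_nmi ? &tracepoint_srcu_nmi : &tracepoint_srcu;
}

static int toy_srcu_read_lock(struct toy_srcu *sp)
{
        int idx = sp->idx & 0x1;

        sp->lock_count[idx]++;  /* non-atomic, like this_cpu_inc */
        return idx;
}

static void toy_srcu_read_unlock(struct toy_srcu *sp, int idx)
{
        assert(idx == 0 || idx == 1);
        sp->unlock_count[idx]++;
}

/* Models one tracepoint hit; returns the domain it used. */
static struct toy_srcu *toy_trace_event(void)
{
        struct toy_srcu *sp = tracepoint_srcu_domain();
        int idx = toy_srcu_read_lock(sp);

        /* ... the registered probes would be called here ... */
        toy_srcu_read_unlock(sp, idx);
        return sp;
}
```

Because an NMI-context tracepoint only touches tracepoint_srcu_nmi, it
cannot clobber a half-finished increment on tracepoint_srcu. The cost
in this sketch is on the update side: whoever changes the function
table would presumably have to wait for readers in both domains.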

thanks,

 - Joel