Message-Id: <20201105211739.568279-1-axelrasmussen@google.com>
Date: Thu, 5 Nov 2020 13:17:38 -0800
From: Axel Rasmussen <axelrasmussen@...gle.com>
To: Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Michel Lespinasse <walken@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Daniel Jordan <daniel.m.jordan@...cle.com>,
Jann Horn <jannh@...gle.com>,
Chinwen Chang <chinwen.chang@...iatek.com>,
Davidlohr Bueso <dbueso@...e.de>,
David Rientjes <rientjes@...gle.com>,
Laurent Dufour <ldufour@...ux.ibm.com>
Cc: Yafang Shao <laoar.shao@...il.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, Axel Rasmussen <axelrasmussen@...gle.com>
Subject: [PATCH v6 0/1] mmap_lock: add tracepoints around lock acquisition
This patchset adds tracepoints around mmap_lock acquisition. These let us
measure the latency of lock acquisition, in order to detect contention.
This version is based on v5.10-rc2.
Changes since v5:
- Michel pointed out that rwsem_release in mmap_read_trylock_non_owner doesn't
  actually release the lock; it just releases lockdep's ownership tracking. It
  is therefore incorrect to call __mmap_lock_trace_released there, so the call
  has been removed.
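  For reference, a minimal sketch of the wrapper in question (illustrative,
  not the patch's verbatim code):

	static inline bool mmap_read_trylock_non_owner(struct mm_struct *mm)
	{
		bool ret = down_read_trylock(&mm->mmap_lock) != 0;

		if (ret) {
			/*
			 * This only drops lockdep's ownership tracking; the
			 * rwsem itself remains held, so emitting a "released"
			 * trace event here would be wrong.
			 */
			rwsem_release(&mm->mmap_lock.dep_map, _RET_IP_);
		}
		return ret;
	}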
Changes since v4:
- Redesigned buffer allocation to deal with the fact that a trace event might
  be interrupted by e.g. an IRQ, for which a per-CPU buffer is insufficient.
  Now we allocate one buffer per CPU per context we might be called in
  (currently 4: normal, irq, softirq, NMI). We have three trace events which
  can potentially all be enabled, and all of which need a buffer; to avoid
  further multiplying the number of buffers by 3, they share the same set of
  buffers. This requires a spinlock + counter setup, so we allocate the
  buffers only once, and free them only when *all* of the trace events are
  _unreg()-ed.
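  A rough sketch of the scheme (names and the per-slice size here are
  illustrative approximations, not necessarily what the patch uses):

	#include <linux/cpumask.h>
	#include <linux/percpu.h>
	#include <linux/slab.h>
	#include <linux/spinlock.h>

	/* One slice per context we can be called in: normal, irq,
	 * softirq, NMI. */
	#define CONTEXT_COUNT 4
	/* Assumed per-slice size, for illustration only. */
	#define MEMCG_PATH_BUF_SIZE 256

	static DEFINE_PER_CPU(char *, memcg_path_buf);
	static DEFINE_PER_CPU(int, memcg_path_buf_idx);

	/* reg_refcount counts how many of the three events are registered. */
	static DEFINE_SPINLOCK(reg_lock);
	static int reg_refcount;

	int trace_mmap_lock_reg(void)
	{
		unsigned long flags;
		int cpu;

		spin_lock_irqsave(&reg_lock, flags);
		/* Only the first _reg() call allocates the shared buffers. */
		if (reg_refcount++ > 0)
			goto out;

		for_each_possible_cpu(cpu) {
			per_cpu(memcg_path_buf, cpu) =
				kmalloc(MEMCG_PATH_BUF_SIZE * CONTEXT_COUNT,
					GFP_NOWAIT);
			/* (Allocation-failure handling elided for brevity.) */
			per_cpu(memcg_path_buf_idx, cpu) = 0;
		}
	out:
		spin_unlock_irqrestore(&reg_lock, flags);
		return 0;
	}

	void trace_mmap_lock_unreg(void)
	{
		unsigned long flags;
		int cpu;

		spin_lock_irqsave(&reg_lock, flags);
		/* Free only when the *last* event is unregistered. */
		if (--reg_refcount > 0)
			goto out;

		for_each_possible_cpu(cpu)
			kfree(per_cpu(memcg_path_buf, cpu));
	out:
		spin_unlock_irqrestore(&reg_lock, flags);
	}

	/*
	 * Each nesting level (e.g. an IRQ interrupting a task-context
	 * event) advances the index, so it gets its own slice of this
	 * CPU's buffer.
	 */
	static char *get_memcg_path_buf(void)
	{
		int idx = this_cpu_add_return(memcg_path_buf_idx,
					      MEMCG_PATH_BUF_SIZE) -
			  MEMCG_PATH_BUF_SIZE;

		return &this_cpu_read(memcg_path_buf)[idx];
	}

	static void put_memcg_path_buf(void)
	{
		this_cpu_sub(memcg_path_buf_idx, MEMCG_PATH_BUF_SIZE);
	}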
Changes since v3:
- Switched EXPORT_SYMBOL to EXPORT_TRACEPOINT_SYMBOL, removed comment.
- Removed redundant trace_..._enabled() check.
- Defined the three TRACE_EVENTs separately, instead of sharing an event class.
The tradeoff is 524 more bytes in .text, but the start_locking and released
events no longer have a vestigial "success" field, so they're simpler +
faster.
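  For illustration, the rough shape of one of the three (a sketch only; the
  real definitions live in include/trace/events/mmap_lock.h in the patch, and
  the usual trace header boilerplate, TRACE_SYSTEM and so on, is elided):

	TRACE_EVENT(mmap_lock_start_locking,

		TP_PROTO(struct mm_struct *mm, const char *memcg_path,
			 bool write),

		TP_ARGS(mm, memcg_path, write),

		TP_STRUCT__entry(
			__field(struct mm_struct *, mm)
			__string(memcg_path, memcg_path)
			__field(bool, write)
		),

		TP_fast_assign(
			__entry->mm = mm;
			__assign_str(memcg_path, memcg_path);
			__entry->write = write;
		),

		TP_printk("mm=%p memcg_path=%s write=%s",
			  __entry->mm, __get_str(memcg_path),
			  __entry->write ? "true" : "false")
	);

  Only the acquire_returned event carries a "success" field; with separate
  definitions, start_locking and released no longer have to.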
Changes since v2:
- Refactored tracing helper functions so the helpers are simpler, but the
  locking functions are slightly more verbose. Overall, this decreased the
  delta to mmap_lock.h slightly.
- Fixed a typo in a comment. :)
Changes since v1:
- Functions renamed to reserve the "trace_" prefix for actual tracepoints.
- We no longer measure the duration directly. Instead, users are expected to
  construct a synthetic event which computes the interval between "start
  locking" and "acquire returned" (an example follows at the end of this
  list).
- The new helper for checking if tracepoints are enabled in a header is used to
avoid un-inlining any of the lock wrappers. This yields ~zero overhead if the
tracepoints aren't enabled, and therefore obviates the need for a Kconfig for
this change.
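  The inline-friendly enabled check looks roughly like this (a sketch of the
  pattern only; the patch's actual wrappers differ in detail):

	#include <linux/tracepoint-defs.h>

	DECLARE_TRACEPOINT(mmap_lock_start_locking);

	void __mmap_lock_do_trace_start_locking(struct mm_struct *mm,
						bool write);

	static inline void mmap_write_lock(struct mm_struct *mm)
	{
		/*
		 * tracepoint_enabled() compiles down to a static-branch
		 * test, so the wrapper stays inline and costs ~nothing
		 * while tracing is off.
		 */
		if (tracepoint_enabled(mmap_lock_start_locking))
			__mmap_lock_do_trace_start_locking(mm, true);
		down_write(&mm->mmap_lock);
		/* ("acquire returned" tracing elided for brevity.) */
	}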
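  As an example of the synthetic-event approach, something along these lines
  should work (untested; the tracepoint and field names below come from this
  series, but the exact histogram incantation is an assumption):

	# Define a synthetic event carrying the computed latency.
	echo 'mmap_lock_latency u64 lat' \
	    >> /sys/kernel/tracing/synthetic_events

	# Stamp the start time, keyed on the mm pointer.
	echo 'hist:keys=mm:ts0=common_timestamp.usecs' \
	    >> /sys/kernel/tracing/events/mmap_lock/mmap_lock_start_locking/trigger

	# On return, compute the delta and fire the synthetic event.
	echo 'hist:keys=mm:lat=common_timestamp.usecs-$ts0:onmatch(mmap_lock.mmap_lock_start_locking).mmap_lock_latency($lat)' \
	    >> /sys/kernel/tracing/events/mmap_lock/mmap_lock_acquire_returned/trigger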
Axel Rasmussen (1):
mmap_lock: add tracepoints around lock acquisition
include/linux/mmap_lock.h | 94 +++++++++++++++-
include/trace/events/mmap_lock.h | 107 ++++++++++++++++++
mm/Makefile | 2 +-
mm/mmap_lock.c | 187 +++++++++++++++++++++++++++++++
4 files changed, 384 insertions(+), 6 deletions(-)
create mode 100644 include/trace/events/mmap_lock.h
create mode 100644 mm/mmap_lock.c
--
2.29.1.341.ge80a0c044ae-goog