Message-Id: <1538157201-29173-1-git-send-email-longman@redhat.com>
Date: Fri, 28 Sep 2018 13:53:16 -0400
From: Waiman Long <longman@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Will Deacon <will.deacon@....com>
Cc: linux-kernel@...r.kernel.org, Waiman Long <longman@...hat.com>
Subject: [PATCH 0/5] locking/lockdep: Improve lockdep performance
Enabling CONFIG_LOCKDEP and other related debug options will greatly
reduce system performance. This patchset aims to reduce the performance
slowdown caused by the lockdep code.
Patch 1 just removes an inline function that wasn't used.
Patches 2 and 3 are minor tweaks to optimize the code.
Patch 4 makes class->ops a per-cpu counter.
Patch 5 moves the lock_release() call outside of a lock critical section.
Parallel kernel compilation tests (make -j <#cpu>) were performed on
2 different systems:
1) a 1-socket 22-core 44-thread Skylake system
2) a 4-socket 72-core 144-thread Broadwell system
The build times with pre-patch and post-patch debug kernels were:
System Pre-patch Post-patch %Change
------ --------- ---------- -------
1-socket 8m53.9s 8m41.2s -2.4%
4-socket 7m27.0s 5m31.0s -26%
I believe the last two patches yield most of the performance
improvement.
Waiman Long (5):
locking/lockdep: Remove add_chain_cache_classes()
locking/lockdep: Eliminate redundant irqs check in __lock_acquire()
locking/lockdep: Add a faster path in __lock_release()
locking/lockdep: Make class->ops a percpu counter
locking/lockdep: Call lock_release after releasing the lock
include/linux/lockdep.h | 2 +-
include/linux/rwlock_api_smp.h | 16 +++---
include/linux/spinlock_api_smp.h | 8 +--
kernel/locking/lockdep.c | 120 ++++++++++++---------------------------
4 files changed, 48 insertions(+), 98 deletions(-)
--
1.8.3.1