linux-kernel - Re: [PATCH] locking/lockdep: Disable KASAN instrumentation of lockdep.c

Message-ID: <c263295e-0345-40a2-932e-b83c7728941c@redhat.com>
Date: Mon, 3 Feb 2025 09:11:25 -0500
From: Waiman Long <llong@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>, Waiman Long <llong@...hat.com>
Cc: Ingo Molnar <mingo@...hat.com>, Will Deacon <will.deacon@....com>,
 Boqun Feng <boqun.feng@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] locking/lockdep: Disable KASAN instrumentation of
 lockdep.c

On 2/3/25 6:24 AM, Peter Zijlstra wrote:
> On Fri, Jan 31, 2025 at 04:47:06PM -0500, Waiman Long wrote:
>> On 1/31/25 11:50 AM, Waiman Long wrote:
>>> Both KASAN and LOCKDEP are commonly enabled in building a debug kernel.
>>> Each of them can significantly slow down a debug kernel.
>>> Enabling KASAN instrumentation of the LOCKDEP code will slow
>>> things down further.
>>>
>>> Since LOCKDEP is a high overhead debugging tool, it will never get
>>> enabled in a production kernel. The LOCKDEP code is also pretty mature
>>> and is unlikely to get major changes. There is also a possibility of
>>> recursion similar to KCSAN. As the small advantage of enabling KASAN
>>> instrumentation to catch potential memory access errors is probably
>>> not worth the drawback of further slowing down a debug kernel, disable
>>> KASAN instrumentation to enable a debug kernel to gain a little bit of
>>> speed back.
>>>
>>> With a debug kernel with both LOCKDEP and KASAN enabled running on a
>>> 2-socket 144-thread system, the time to do a "make -j144" kernel build
>>> was 18m40.641s. After applying this patch, the parallel kernel build
>>> time was reduced to 17m35.136s. This is a reduction of about 66s (5.8%).
>>>
>>> Signed-off-by: Waiman Long <longman@...hat.com>
>>> ---
>>>    kernel/locking/Makefile | 1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
>>> index 0db4093d17b8..8a588b0227b1 100644
>>> --- a/kernel/locking/Makefile
>>> +++ b/kernel/locking/Makefile
>>> @@ -6,6 +6,7 @@ KCOV_INSTRUMENT		:= n
>>>    obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
>>>    # Avoid recursion lockdep -> sanitizer -> ... -> lockdep.
>>> +KASAN_SANITIZE_lockdep.o := n
>>>    KCSAN_SANITIZE_lockdep.o := n
>>>    ifdef CONFIG_FUNCTION_TRACER
>> The rationale behind this patch is that a similarly configured
>> PREEMPT_RT debug kernel is found to be about 3 times slower than the non-RT
>> debug kernel. On the same test system, the parallel build runtime is
>> 59m56.722s. After applying this patch, it is reduced to 38m3.348s. This
>> reduction of more than 1/3 is more than I would have expected. So the lockdep
>> code is much more heavily used in a PREEMPT_RT debug kernel.
> Perhaps put that in the changelog instead?
>
> It's not like RT is some secret out-of-tree project :-)
>
> Also, any quick clues as to what causes the extra lockdep overhead?
> Initially I thought perhaps local-lock, but that should also cause
> lockdep on !RT builds.

Yes, I am planning to update the patch with more RT debug kernel 
performance data.
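
For reference, the figures quoted above can be cross-checked with a
minimal Python sketch. The helper names (to_seconds, reduction) are
hypothetical and exist only for this illustration; the timings are
copied from the quoted text.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style string such as '18m40.641s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def reduction(before: str, after: str) -> tuple:
    """Return (seconds saved, percentage saved relative to `before`)."""
    b, a = to_seconds(before), to_seconds(after)
    saved = b - a
    return saved, 100.0 * saved / b


# Non-RT debug kernel, "make -j144", before and after the patch.
non_rt = reduction("18m40.641s", "17m35.136s")
# PREEMPT_RT debug kernel on the same system, before and after the patch.
rt = reduction("59m56.722s", "38m3.348s")
# How much slower the unpatched RT debug kernel build is than the non-RT one.
rt_slowdown = to_seconds("59m56.722s") / to_seconds("18m40.641s")

print(f"non-RT: {non_rt[0]:.1f}s saved ({non_rt[1]:.1f}%)")
print(f"RT:     {rt[0]:.1f}s saved ({rt[1]:.1f}%)")
print(f"RT / non-RT build time ratio: {rt_slowdown:.2f}")
```

This agrees with the "about 66s (5.8%)" figure for the non-RT kernel,
the "more than 1/3" reduction for PREEMPT_RT, and the "about 3 times
slower" comparison between the two.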

As to why, my guess is that the average nesting depth will be higher
because spin_lock_irq* no longer disables IRQs and there is an extra wait
lock underneath the rt-mutex. The increase in the number of sleep-wake
cycles caused by the sleeping-lock nature of rt-spinlock may also be a
contributing factor.

Cheers,
Longman
