Message-ID: <a6993bbd-ec8a-40e1-9ef2-74f920642188@redhat.com>
Date: Wed, 12 Feb 2025 11:57:28 -0500
From: Waiman Long <llong@...hat.com>
To: Marco Elver <elver@...gle.com>, Boqun Feng <boqun.feng@...il.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Will Deacon <will.deacon@....com>, linux-kernel@...r.kernel.org,
Andrey Ryabinin <ryabinin.a.a@...il.com>,
Alexander Potapenko <glider@...gle.com>,
Andrey Konovalov <andreyknvl@...il.com>, Dmitry Vyukov <dvyukov@...gle.com>,
Vincenzo Frascino <vincenzo.frascino@....com>, kasan-dev@...glegroups.com
Subject: Re: [PATCH v3 3/3] locking/lockdep: Disable KASAN instrumentation of
lockdep.c
On 2/12/25 6:30 AM, Marco Elver wrote:
> On Wed, 12 Feb 2025 at 06:57, Boqun Feng <boqun.feng@...il.com> wrote:
>> [Cc KASAN]
>>
>> A Reviewed-by or Acked-by from KASAN would be nice, thanks!
>>
>> Regards,
>> Boqun
>>
>> On Sun, Feb 09, 2025 at 11:26:12PM -0500, Waiman Long wrote:
>>> Both KASAN and LOCKDEP are commonly enabled when building a debug
>>> kernel, and each of them can significantly slow the kernel down.
>>> Enabling KASAN instrumentation of the LOCKDEP code will further slow
>>> things down.
>>>
>>> Since LOCKDEP is a high-overhead debugging tool, it will never be
>>> enabled in a production kernel. The LOCKDEP code is also pretty mature
>>> and is unlikely to see major changes. There is also a possibility of
>>> recursion similar to that seen with KCSAN.
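>>>
>>> The usual kbuild mechanism for this is a per-object KASAN_SANITIZE
>>> knob; a minimal sketch, assuming the patch takes the standard
>>> approach:
>>>
>>>   # kernel/locking/Makefile
>>>   # Build lockdep.o without KASAN instrumentation.
>>>   KASAN_SANITIZE_lockdep.o := n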
>>>
>>> To evaluate the performance impact of disabling KASAN instrumentation
>>> of lockdep.c, the time to do a parallel build of the Linux defconfig
>>> kernel was used as the benchmark. Two x86-64 systems (Skylake & Zen 2)
>>> and an arm64 system were used as test beds. Two sets of non-RT and RT
>>> kernels with similar configurations, differing mainly in
>>> CONFIG_PREEMPT_RT, were used for evaluation.
>>>
>>> For the Skylake system:
>>>
>>>   Kernel                        Run time          Sys time
>>>   ------                        --------          --------
>>>   Non-debug kernel (baseline)   0m47.642s          4m19.811s
>>>   Debug kernel                  2m11.108s (x2.8)  38m20.467s (x8.9)
>>>   Debug kernel (patched)        1m49.602s (x2.3)  31m28.501s (x7.3)
>>>   Debug kernel
>>>    (patched + mitigations=off)  1m30.988s (x1.9)  26m41.993s (x6.2)
>>>
>>>   RT kernel (baseline)          0m54.871s          7m15.340s
>>>   RT debug kernel               6m07.151s (x6.7) 135m47.428s (x18.7)
>>>   RT debug kernel (patched)     3m42.434s (x4.1)  74m51.636s (x10.3)
>>>   RT debug kernel
>>>    (patched + mitigations=off)  2m40.383s (x2.9)  57m54.369s (x8.0)
>>>
>>> For the Zen 2 system:
>>>
>>>   Kernel                        Run time           Sys time
>>>   ------                        --------           --------
>>>   Non-debug kernel (baseline)   1m42.806s          39m48.714s
>>>   Debug kernel                  4m04.524s (x2.4)  125m35.904s (x3.2)
>>>   Debug kernel (patched)        3m56.241s (x2.3)  127m22.378s (x3.2)
>>>   Debug kernel
>>>    (patched + mitigations=off)  2m38.157s (x1.5)   92m35.680s (x2.3)
>>>
>>>   RT kernel (baseline)          1m51.500s          14m56.322s
>>>   RT debug kernel              16m04.962s (x8.7)  244m36.463s (x16.4)
>>>   RT debug kernel (patched)     9m09.073s (x4.9)  129m28.439s (x8.7)
>>>   RT debug kernel
>>>    (patched + mitigations=off)  3m31.662s (x1.9)   51m01.391s (x3.4)
>>>
>>> For the arm64 system:
>>>
>>>   Kernel                        Run time           Sys time
>>>   ------                        --------           --------
>>>   Non-debug kernel (baseline)   1m56.844s           8m47.150s
>>>   Debug kernel                  3m54.774s (x2.0)   92m30.098s (x10.5)
>>>   Debug kernel (patched)        3m32.429s (x1.8)   77m40.779s (x8.8)
>>>
>>>   RT kernel (baseline)          4m01.641s          18m16.777s
>>>   RT debug kernel              19m32.977s (x4.9)  304m23.965s (x16.7)
>>>   RT debug kernel (patched)   16m28.354s (x4.1)  234m18.149s (x12.8)
>>>
>>> Turning the mitigations off doesn't seem to have any noticeable impact
>>> on the performance of the arm64 system, so the mitigations=off entries
>>> aren't included.
>>>
>>> For the x86 CPUs, CPU mitigations have a much bigger impact on
>>> performance, especially for the RT debug kernel. The SRSO mitigation
>>> on Zen 2 has an especially big impact on the debug kernel and accounts
>>> for the majority of the slowdown with mitigations on, because the
>>> patched ret instructions slow down function returns. A lot of helper
>>> functions that are normally compiled out or inlined become real
>>> function calls in the debug kernel, and the KASAN instrumentation
>>> inserts a lot of __asan_loadX*() and __kasan_check_read() function
>>> calls into the memory-access portions of the code. Lockdep's
>>> __lock_acquire() function, for instance, has 66 __asan_loadX*() and
>>> 6 __kasan_check_read() calls added with KASAN instrumentation. Of
>>> course, the actual numbers may vary depending on the compiler used
>>> and the exact version of the lockdep code.
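>>>
>>> Conceptually, each instrumented memory access gains a preceding check
>>> call in outline mode; an illustrative sketch only, not actual compiler
>>> output:
>>>
>>>   /* original C:  val = *p;  (an 8-byte load)                 */
>>>   __asan_load8((unsigned long)p); /* shadow check, may report */
>>>   val = *p;                       /* the original load        */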
> For completeness' sake, we'd also have to compare with
> CONFIG_KASAN_INLINE=y, which gets rid of the __asan_ calls (not the
> explicit __kasan_ checks). But I leave it up to you - I'm aware it
> results in slow-downs, too. ;-)
I just realized that my config file for the non-RT debug kernel does
have CONFIG_KASAN_INLINE=y set, though the RT debug kernel's does not.
For the non-RT debug kernel, the __asan_report_load* functions are
still being called because lockdep.c is very big (> 6k lines of code),
so the "call_threshold := 10000" in scripts/Makefile.kasan is probably
not enough for lockdep.c.
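For reference, this is roughly how that threshold is wired up (an
abridged sketch of scripts/Makefile.kasan; the exact flags vary by
kernel version and compiler):

  # scripts/Makefile.kasan (abridged sketch)
  ifdef CONFIG_KASAN_INLINE
          call_threshold := 10000
  else
          call_threshold := 0
  endif
  # A function whose number of instrumented accesses reaches the
  # threshold is instrumented with out-of-line __asan_* calls even
  # in inline mode:
  CFLAGS_KASAN += --param asan-instrumentation-with-call-threshold=$(call_threshold)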
>
>>> With the newly added rtmutex and lockdep lock events, the relevant
>>> event counts for the test runs with the Skylake system were:
>>>
>>>   Event type          Debug kernel    RT debug kernel
>>>   ----------          ------------    ---------------
>>>   lockdep_acquire    1,968,663,277      5,425,313,953
>>>   rtlock_slowlock                -        401,701,156
>>>   rtmutex_slowlock               -            139,672
>>>
>>> The number of __lock_acquire() calls in the RT debug kernel is 2.8
>>> times that of the non-RT debug kernel with the same workload. Since
>>> the __lock_acquire() function is a big hitter in terms of performance
>>> slowdown, this makes the RT debug kernel much slower than the non-RT
>>> one. The average lock nesting depth is likely to be higher in the RT
>>> debug kernel too, leading to longer execution time in the
>>> __lock_acquire() function.
>>>
>>> As the small advantage of enabling KASAN instrumentation to catch
>>> potential memory access errors in the lockdep debugging tool is
>>> probably not worth the drawback of further slowing down a debug
>>> kernel, disable KASAN instrumentation in the lockdep code to allow
>>> the debug kernels to regain some performance, especially the RT debug
>>> kernels.
> It's not about catching a bug in the lockdep code, but rather guarding
> against bugs in the code that allocated the storage for some
> synchronization object. Since lockdep state is embedded in each
> synchronization object, lockdep checking code may be passed a
> reference to garbage data, e.g. on a use-after-free (or even an
> out-of-bounds access if there's an array of sync objects). In that
> case, all bets are off and lockdep may produce random false reports.
> Sure, the system is already in a bad state at that point, but this is
> going to make debugging much harder.
>
> Our approach has always been to ensure that an error state is reported
> as soon as it is detected, before it results in random failures as
> execution continues (e.g. bad lock reports).
>
> To guard against that, I would propose adding carefully placed
> kasan_check_byte() calls in the lockdep code.
Will take a look at that.
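To make that concrete, here is a minimal sketch of what such a check
could look like (illustrative only; lockdep_map_accessible() is a
hypothetical helper, not from any posted patch):

  #include <linux/kasan.h>    /* kasan_check_byte() */
  #include <linux/lockdep.h>  /* struct lockdep_map */

  /*
   * Verify that the object holding the lockdep state is still live
   * before lockdep dereferences it. kasan_check_byte() returns false
   * (and emits a KASAN report) if the byte at *lock is poisoned, e.g.
   * after a use-after-free, so lockdep could bail out early instead
   * of feeding garbage into the dependency graph.
   */
  static bool lockdep_map_accessible(const struct lockdep_map *lock)
  {
          return kasan_check_byte(lock);
  }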
Cheers,
Longman