Message-ID: <Z6w4UlCQa_g1OHlN@Mac.home>
Date: Tue, 11 Feb 2025 21:57:38 -0800
From: Boqun Feng <boqun.feng@...il.com>
To: Waiman Long <longman@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
	Will Deacon <will.deacon@....com>, linux-kernel@...r.kernel.org,
	Andrey Ryabinin <ryabinin.a.a@...il.com>,
	Alexander Potapenko <glider@...gle.com>,
	Andrey Konovalov <andreyknvl@...il.com>,
	Dmitry Vyukov <dvyukov@...gle.com>,
	Vincenzo Frascino <vincenzo.frascino@....com>,
	kasan-dev@...glegroups.com
Subject: Re: [PATCH v3 3/3] locking/lockdep: Disable KASAN instrumentation of
 lockdep.c

[Cc KASAN]

A Reviewed-by or Acked-by from KASAN would be nice, thanks!

Regards,
Boqun

On Sun, Feb 09, 2025 at 11:26:12PM -0500, Waiman Long wrote:
> Both KASAN and LOCKDEP are commonly enabled when building a debug kernel.
> Each of them can significantly slow down a debug kernel. Enabling KASAN
> instrumentation of the LOCKDEP code will slow things down even further.
> 
> Since LOCKDEP is a high overhead debugging tool, it will never get
> enabled in a production kernel. The LOCKDEP code is also pretty mature
> and is unlikely to get major changes. There is also a possibility of
> recursion similar to KCSAN.
> 
> To evaluate the performance impact of disabling KASAN instrumentation
> of lockdep.c, the time to do a parallel build of the Linux defconfig
> kernel was used as the benchmark. Two x86-64 systems (Skylake & Zen 2)
> and an arm64 system were used as test beds. Two sets of non-RT and RT
> kernels with similar configurations, differing mainly in CONFIG_PREEMPT_RT,
> were used for evaluation.
> 
> For the Skylake system:
> 
>   Kernel			Run time	    Sys time
>   ------			--------	    --------
>   Non-debug kernel (baseline)	0m47.642s	      4m19.811s
>   Debug kernel			2m11.108s (x2.8)     38m20.467s (x8.9)
>   Debug kernel (patched)	1m49.602s (x2.3)     31m28.501s (x7.3)
>   Debug kernel
>   (patched + mitigations=off) 	1m30.988s (x1.9)     26m41.993s (x6.2)
> 
>   RT kernel (baseline)		0m54.871s	      7m15.340s
>   RT debug kernel		6m07.151s (x6.7)    135m47.428s (x18.7)
>   RT debug kernel (patched)	3m42.434s (x4.1)     74m51.636s (x10.3)
>   RT debug kernel
>   (patched + mitigations=off) 	2m40.383s (x2.9)     57m54.369s (x8.0)
> 
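
As a cross-check of the Skylake table above, here is a minimal sketch (not
part of the patch) of how the (xN) factors can be recomputed from the raw
timings. The helper names to_seconds() and slowdown() are made up for
illustration; only the durations are taken from the table.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style duration such as '2m11.108s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized duration: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def slowdown(run: str, baseline: str) -> float:
    """Slowdown factor of `run` relative to `baseline`, to one decimal."""
    return round(to_seconds(run) / to_seconds(baseline), 1)


if __name__ == "__main__":
    # Skylake run times quoted in the table above.
    print(slowdown("2m11.108s", "0m47.642s"))  # debug vs non-debug: 2.8
    print(slowdown("1m49.602s", "0m47.642s"))  # patched debug: 2.3
    print(slowdown("6m07.151s", "0m54.871s"))  # RT debug vs RT baseline: 6.7
    print(slowdown("3m42.434s", "0m54.871s"))  # patched RT debug: 4.1
```

The same two helpers apply unchanged to the Sys time column and to the
other systems' tables.
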
> For the Zen 2 system:
> 
>   Kernel			Run time	    Sys time
>   ------			--------	    --------
>   Non-debug kernel (baseline)	1m42.806s	     39m48.714s
>   Debug kernel			4m04.524s (x2.4)    125m35.904s (x3.2)
>   Debug kernel (patched)	3m56.241s (x2.3)    127m22.378s (x3.2)
>   Debug kernel
>   (patched + mitigations=off) 	2m38.157s (x1.5)     92m35.680s (x2.3)
> 
>   RT kernel (baseline)		 1m51.500s	     14m56.322s
>   RT debug kernel		16m04.962s (x8.7)   244m36.463s (x16.4)
>   RT debug kernel (patched)	 9m09.073s (x4.9)   129m28.439s (x8.7)
>   RT debug kernel
>   (patched + mitigations=off) 	 3m31.662s (x1.9)    51m01.391s (x3.4)
> 
> For the arm64 system:
> 
>   Kernel			Run time	    Sys time
>   ------			--------	    --------
>   Non-debug kernel (baseline)	1m56.844s	      8m47.150s
>   Debug kernel			3m54.774s (x2.0)     92m30.098s (x10.5)
>   Debug kernel (patched)	3m32.429s (x1.8)     77m40.779s (x8.8)
> 
>   RT kernel (baseline)		 4m01.641s	     18m16.777s
>   RT debug kernel		19m32.977s (x4.9)   304m23.965s (x16.7)
>   RT debug kernel (patched)	16m28.354s (x4.1)   234m18.149s (x12.8)
> 
> Turning the mitigations off doesn't seem to have any noticeable impact
> on the performance of the arm64 system. So the mitigations=off entries
> aren't included.
> 
> For the x86 CPUs, CPU mitigations have a much bigger impact on
> performance, especially for the RT debug kernel. The SRSO mitigation in
> Zen 2 has an especially big impact on the debug kernel. It also accounts
> for the majority of the slowdown with mitigations on. This is because the
> patched ret instruction slows down function returns. A lot of helper
> functions that are normally compiled out or inlined may become real
> function calls in the debug kernel. The KASAN instrumentation inserts a
> lot of __asan_loadX*() and __kasan_check_read() function calls into the
> memory-accessing portions of the code. Lockdep's __lock_acquire() function,
> for instance, has 66 __asan_loadX*() and 6 __kasan_check_read() calls
> added with KASAN instrumentation. Of course, the actual numbers may vary
> depending on the compiler used and the exact version of the lockdep code.
> 
> With the newly added rtmutex and lockdep lock events, the relevant
> event counts for the test runs with the Skylake system were:
> 
>   Event type		Debug kernel	RT debug kernel
>   ----------		------------	---------------
>   lockdep_acquire	1,968,663,277	5,425,313,953
>   rtlock_slowlock	     -		  401,701,156
>   rtmutex_slowlock	     -		      139,672
> 
> The RT debug kernel makes 2.8 times as many __lock_acquire() calls as the
> non-RT debug kernel with the same workload. Since the __lock_acquire()
> function is a big hitter in terms of performance slowdown, this makes
> the RT debug kernel much slower than the non-RT one. The average lock
> nesting depth is likely to be higher in the RT debug kernel too, leading
> to longer execution time in the __lock_acquire() function.
> 
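
For reference, a small sketch (not part of the patch) that recomputes the
2.8 factor from the lockdep_acquire counts and puts it next to the Sys time
ratio of the two unpatched Skylake debug kernels. The variable names are
made up for illustration; the inputs are the figures quoted in the commit
message.

```python
# lockdep_acquire event counts on the Skylake system.
ACQUIRES_DEBUG = 1_968_663_277
ACQUIRES_RT_DEBUG = 5_425_313_953

# Sys time of the unpatched debug kernels on the same system, in seconds.
SYS_DEBUG = 38 * 60 + 20.467       # 38m20.467s
SYS_RT_DEBUG = 135 * 60 + 47.428   # 135m47.428s

acquire_ratio = ACQUIRES_RT_DEBUG / ACQUIRES_DEBUG
sys_ratio = SYS_RT_DEBUG / SYS_DEBUG

if __name__ == "__main__":
    print(f"lockdep_acquire ratio (RT / non-RT): x{acquire_ratio:.1f}")  # x2.8
    print(f"Sys time ratio (RT / non-RT):        x{sys_ratio:.1f}")      # x3.5
```

The Sys time ratio coming out somewhat higher than the call count ratio is
at least consistent with each __lock_acquire() call costing more on RT,
though other factors (e.g. the rtlock slow path) presumably contribute too.
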
> As the small advantage of enabling KASAN instrumentation to catch
> potential memory access errors in the lockdep debugging tool is probably
> not worth the drawback of further slowing down a debug kernel, disable
> KASAN instrumentation in the lockdep code to allow the debug kernels
> to regain some performance, especially the RT debug kernels.
> 
> Signed-off-by: Waiman Long <longman@...hat.com>
> ---
>  kernel/locking/Makefile | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
> index 0db4093d17b8..a114949eeed5 100644
> --- a/kernel/locking/Makefile
> +++ b/kernel/locking/Makefile
> @@ -5,7 +5,8 @@ KCOV_INSTRUMENT		:= n
>  
>  obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
>  
> -# Avoid recursion lockdep -> sanitizer -> ... -> lockdep.
> +# Avoid recursion lockdep -> sanitizer -> ... -> lockdep & improve performance.
> +KASAN_SANITIZE_lockdep.o := n
>  KCSAN_SANITIZE_lockdep.o := n
>  
>  ifdef CONFIG_FUNCTION_TRACER
> -- 
> 2.48.1
> 
