Message-ID: <ZqjQp8NrTYM_ORN1@pc636>
Date: Tue, 30 Jul 2024 13:38:15 +0200
From: Uladzislau Rezki <urezki@...il.com>
To: Adrian Huang <adrianhuang0701@...il.com>
Cc: urezki@...il.com, ahuang12@...ovo.com, akpm@...ux-foundation.org,
andreyknvl@...il.com, bhe@...hat.com, dvyukov@...gle.com,
glider@...gle.com, hch@...radead.org, kasan-dev@...glegroups.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
ryabinin.a.a@...il.com, sunjw10@...ovo.com,
vincenzo.frascino@....com
Subject: Re: [PATCH 1/1] mm/vmalloc: Combine all TLB flush operations of
KASAN shadow virtual address into one operation
> On Mon, Jul 29, 2024 at 7:29 PM Uladzislau Rezki <urezki@...il.com> wrote:
> > It would be really good if Adrian could run the "compiling workload" on
> > his big system and post the statistics here.
> >
> > For example:
> > a) v6.11-rc1 + KASAN.
> > b) v6.11-rc1 + KASAN + patch.
>
> Sure, please see the statistics below.
>
> Test Result (based on 6.11-rc1)
> ===============================
>
> 1. Profile purge_vmap_node()
>
> A. Command: trace-cmd record -p function_graph -l purge_vmap_node make -j $(nproc)
>
> B. Average execution time of purge_vmap_node():
>
> no patch (us)      patched (us)      saved
> -------------      ------------      -----
>     147885.02           3692.51        97%
>
> C. Total execution time of purge_vmap_node():
>
> no patch (us)      patched (us)      saved
> -------------      ------------      -----
>     194173036           5114138        97%
>
> [ftrace log] Without patch: https://gist.github.com/AdrianHuang/a5bec861f67434e1024bbf43cea85959
> [ftrace log] With patch: https://gist.github.com/AdrianHuang/a200215955ee377288377425dbaa04e3
>
> 2. Use the `time` utility to measure execution time
>
> A. Command: make clean && time make -j $(nproc)
>
> B. The following result is the average kernel execution time across five
>    measurements ('sys' field of the `time` output):
>
> no patch (seconds)      patched (seconds)      saved
> ------------------      -----------------      -----
>          36932.904              31403.478        15%
>
> [`time` log] Without patch: https://gist.github.com/AdrianHuang/987b20fd0bd2bb616b3524aa6ee43112
> [`time` log] With patch: https://gist.github.com/AdrianHuang/da2ea4e6aa0b4dcc207b4e40b202f694
>
I meant a different kind of statistics. As noted here https://lore.kernel.org/linux-mm/ZogS_04dP5LlRlXN@pc636/T/#m5d57f11d9f69aef5313f4efbe25415b3bae4c818
I came to the conclusion that the following place and lock:
<snip>
static void exit_notify(struct task_struct *tsk, int group_dead)
{
	bool autoreap;
	struct task_struct *p, *n;
	LIST_HEAD(dead);
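
	/*
	 * Disables local IRQs; they stay disabled until the
	 * matching write_unlock_irq().
	 */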
	write_lock_irq(&tasklist_lock);
	...
<snip>
keeps IRQs disabled, which means that purge_vmap_node() still makes
progress, but it can be slow:

CPU_1:
  - disables IRQs;
  - tries to grab the tasklist_lock.

CPU_2:
  - sends an IPI to CPU_1;
  - waits until the specified callback has been executed on CPU_1.

Since CPU_1 has IRQs disabled, servicing the IPI and completing the
callback are delayed until CPU_1 re-enables IRQs.
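
To illustrate the pattern, here is a minimal kernel-style sketch.
remote_flush() and flush_cb() are hypothetical names used only for
illustration; smp_call_function_single() is the real primitive that a
synchronous cross-CPU call boils down to:

<snip>
#include <linux/smp.h>

/* Hypothetical callback; runs in IRQ context on the target CPU. */
static void flush_cb(void *info)
{
	/* e.g. a local TLB flush */
}

/* CPU_2 side: a synchronous cross-CPU call. */
static void remote_flush(int target_cpu)
{
	/*
	 * wait == 1: spin until flush_cb() has run on target_cpu.
	 * If target_cpu sits in write_lock_irq(&tasklist_lock) with
	 * IRQs off, we spin here for as long as it waits for the lock.
	 */
	smp_call_function_single(target_cpu, flush_cb, NULL, 1);
}
<snip>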
Could you please post lock statistics for the kernel-compiling use case?
KASAN + patch is enough, IMO. This is just to double-check whether the
tasklist_lock is a problem or not.
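
In case it helps, one way to capture them, assuming the kernel is built
with CONFIG_LOCK_STAT=y:

<snip>
echo 0 > /proc/lock_stat              # clear the statistics
echo 1 > /proc/sys/kernel/lock_stat   # enable collection
make clean && make -j $(nproc)
echo 0 > /proc/sys/kernel/lock_stat   # disable collection
grep -A 8 tasklist_lock /proc/lock_stat
<snip>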
Thanks!
--
Uladzislau Rezki