Message-ID: <znrbeb744774vre5dkeg7kjnnt7uuifs6xw63udcyupwj3veqh@rpcqs7dmoxi6>
Date: Tue, 16 Apr 2024 11:13:05 -0400
From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
To: jeffxu@...omium.org
Cc: akpm@...ux-foundation.org, keescook@...omium.org, jannh@...gle.com,
sroettger@...gle.com, willy@...radead.org, gregkh@...uxfoundation.org,
torvalds@...ux-foundation.org, usama.anjum@...labora.com,
corbet@....net, surenb@...gle.com, merimus@...gle.com,
rdunlap@...radead.org, jeffxu@...gle.com, jorgelo@...omium.org,
groeck@...omium.org, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, linux-mm@...ck.org,
pedro.falcato@...il.com, dave.hansen@...el.com,
linux-hardening@...r.kernel.org, deraadt@...nbsd.org
Subject: Re: [PATCH v10 0/5] Introduce mseal
* jeffxu@...omium.org <jeffxu@...omium.org> [240415 12:35]:
> From: Jeff Xu <jeffxu@...omium.org>
>
> This is the v10 version; it rebases the v9 patch onto 6.9-rc3.
> We have also applied and tested mseal() in Chrome and on Chromebooks.
>
> ------------------------------------------------------------------
...
> MM perf benchmarks
> ==================
> This patch adds a loop in mprotect/munmap/madvise(DONTNEED) to
> check the VMAs' sealing flag, so that no partial update can be made
> when any segment within the given memory range is sealed.
>
> To measure the performance impact of this loop, two tests were
> developed [8].
>
> The first is measuring the time taken for a particular system call,
> by using clock_gettime(CLOCK_MONOTONIC). The second is using
> PERF_COUNT_HW_REF_CPU_CYCLES (exclude user space). Both tests have
> similar results.
>
> The tests roughly follow this sequence:
>     for (i = 0; i < 1000; i++)
>         create 1000 mappings (1 page per VMA)
>         start the sampling
>         for (j = 0; j < 1000; j++)
>             mprotect one mapping
>         stop and save the sample
>         delete 1000 mappings
>     aggregate all samples.
Thank you for doing this performance testing.
>
> The tests below were performed on an Intel(R) Pentium(R) Gold 7505
> @ 2.00GHz Chromebook with 4G of memory.
>
> Based on the latest upstream code:
> The first test (measuring time)
> syscall__ vmas t t_mseal delta_ns per_vma %
> munmap__ 1 909 944 35 35 104%
> munmap__ 2 1398 1502 104 52 107%
> munmap__ 4 2444 2594 149 37 106%
> munmap__ 8 4029 4323 293 37 107%
> munmap__ 16 6647 6935 288 18 104%
> munmap__ 32 11811 12398 587 18 105%
> mprotect 1 439 465 26 26 106%
> mprotect 2 1659 1745 86 43 105%
> mprotect 4 3747 3889 142 36 104%
> mprotect 8 6755 6969 215 27 103%
> mprotect 16 13748 14144 396 25 103%
> mprotect 32 27827 28969 1142 36 104%
> madvise_ 1 240 262 22 22 109%
> madvise_ 2 366 442 76 38 121%
> madvise_ 4 623 751 128 32 121%
> madvise_ 8 1110 1324 215 27 119%
> madvise_ 16 2127 2451 324 20 115%
> madvise_ 32 4109 4642 534 17 113%
>
> The second test (measuring cpu cycle)
> syscall__ vmas cpu cmseal delta_cpu per_vma %
> munmap__ 1 1790 1890 100 100 106%
> munmap__ 2 2819 3033 214 107 108%
> munmap__ 4 4959 5271 312 78 106%
> munmap__ 8 8262 8745 483 60 106%
> munmap__ 16 13099 14116 1017 64 108%
> munmap__ 32 23221 24785 1565 49 107%
> mprotect 1 906 967 62 62 107%
> mprotect 2 3019 3203 184 92 106%
> mprotect 4 6149 6569 420 105 107%
> mprotect 8 9978 10524 545 68 105%
> mprotect 16 20448 21427 979 61 105%
> mprotect 32 40972 42935 1963 61 105%
> madvise_ 1 434 497 63 63 115%
> madvise_ 2 752 899 147 74 120%
> madvise_ 4 1313 1513 200 50 115%
> madvise_ 8 2271 2627 356 44 116%
> madvise_ 16 4312 4883 571 36 113%
> madvise_ 32 8376 9319 943 29 111%
>
If I am reading this right, madvise() is affected more than the other
calls? Is that expected or do we need to have a closer look?
...
> When I discussed the mm performance with Brian Makin, an engineer who
> works on performance, it was brought to my attention that performance
> benchmarks like these, which measure millions of mm syscalls in a
> tight loop, may not accurately reflect real-world scenarios, such as
> that of a database service. Also, this was tested on a single piece
> of hardware running ChromeOS; the data from other hardware or
> distributions might differ. It might be best to take this data with a
> grain of salt.
>
Absolutely, these types of benchmarks are pointless for simulating what
will really happen with any sane program.

However, they are valuable in that they can highlight areas where
something may have been made less efficient. These inefficiencies would
otherwise be lost in the noise of regular system use. They can be used
as a relatively high-level sanity check on what you believe is going on.
I appreciate you doing the work on testing the performance here.
...