Message-ID: <ZWNMZr9QMSBDc0gd@casper.infradead.org>
Date:   Sun, 26 Nov 2023 13:47:18 +0000
From:   Matthew Wilcox <willy@...radead.org>
To:     David Wang <00107082@....com>
Cc:     liam.howlett@...cle.com, akpm@...ux-foundation.org,
        ankitag@...dia.com, bagasdotme@...il.com, chunn@...dia.com,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        regressions@...ts.linux.dev, surenb@...gle.com
Subject: Re: [REGRESSION]: mmap performance regression starting with k-6.1

On Sun, Nov 26, 2023 at 03:18:54PM +0800, David Wang wrote:
> I added memory accesses between mmap and munmap to the simple stress test, and timed it.

It's still not a very good benchmark ...

> My test code now is:
> 
> 	#define MAXN 1024
> 	struct { void* addr; size_t n; } maps[MAXN];
> 	void accessit(char *addr, size_t n) {
> 		for (int i=0; i<n; i+=128) addr[i]=i;
> 	}
> 	int main() {
> 		int i, n, k, r;
> 		void *p;
> 		for (i=0; i<MAXN; i++) {
> 			n = 1024*((rand()%32)+1);
> 			p = mmap(NULL, n, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

So 'n' is now a number between 1kB and 32kB.  That's not terribly
realistic; I'd say you want to be more like

			n = 4096 * ((rand() % 512) + 1);
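
To make the difference between the two ranges concrete, here is a rough
sketch.  The helper names (size_original, size_suggested, pages_spanned)
and the fixed 4096-byte page size are assumptions made for illustration;
they are not part of the test program, and real code would query
sysconf(_SC_PAGESIZE) instead.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this sketch; not queried from the system. */
#define ASSUMED_PAGE_SIZE 4096

/* Mapping length the quoted test picks for a given rand() result r. */
size_t size_original(int r)
{
	return 1024 * (size_t)((r % 32) + 1);
}

/* Mapping length with the suggested formula: whole pages, up to 2MB. */
size_t size_suggested(int r)
{
	return 4096 * (size_t)((r % 512) + 1);
}

/* Number of pages a mapping of len bytes spans, rounded up. */
size_t pages_spanned(size_t len)
{
	assert(len > 0);
	return (len + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
}
```

Under that assumption the original formula yields mappings that span 1
to 8 pages, while the suggested one yields 1 to 512 pages and is always
a whole number of pages.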

> 		for (i=0; i<10000000; i++) {
> 			k = rand()%MAXN;
> 	#ifdef PAGE_FAULT
> 			accessit((char*)maps[k].addr, maps[k].n);
> 	#endif
> 			r = munmap(maps[k].addr, maps[k].n);
> 			if (r) {
> 				perror("fail to munmap");
> 				return -1;
> 			}
> 			n = 1024*((rand()%32)+1);
> 			p = mmap(NULL, n, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

Are you simulating something a real application actually does?
Because this all seems very weird and micro-benchmark-like to me.  The real
applications we've benchmarked see a speedup so I'm not thrilled about
chasing down something that no real application does.

In terms of what's going on in the kernel, for each loop, you're calling
munmap(), taking between 1 and 8 page faults, then calling mmap().
That may just be too few page faults to see the benefit of the maple tree.
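
The "1 to 8" figure can be checked with a small model of the accessit()
walk.  This is only a sketch: faults_for_walk is a hypothetical helper,
and it assumes 4kB pages, that the first write to each page of a fresh
anonymous private mapping costs exactly one fault, and that transparent
huge pages play no part.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the distinct pages touched by a walk over n bytes that writes
 * one byte every 'stride' bytes, starting at offset 0.  Under the
 * assumptions above, that is the number of page faults taken.
 */
size_t faults_for_walk(size_t n, size_t stride, size_t page_size)
{
	size_t faults = 0;
	size_t last_page = (size_t)-1;	/* no page touched yet */

	assert(stride > 0 && page_size > 0);
	for (size_t i = 0; i < n; i += stride) {
		size_t page = i / page_size;

		if (page != last_page) {
			faults++;
			last_page = page;
		}
	}
	return faults;
}
```

With the 128-byte stride from the test, a 1kB mapping gives
faults_for_walk(1024, 128, 4096) == 1 and a 32kB mapping gives
faults_for_walk(32768, 128, 4096) == 8.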
