Message-ID: <20231123143452.erzar3sqhg37hjxz@revolver>
Date:   Thu, 23 Nov 2023 09:34:52 -0500
From:   "Liam R. Howlett" <Liam.Howlett@...cle.com>
To:     Bagas Sanjaya <bagasdotme@...il.com>
Cc:     Chun Ng <chunn@...dia.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux Regressions <regressions@...ts.linux.dev>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        Ankita Garg <ankitag@...dia.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>
Subject: Re: [REGRESSION]: mmap performance regression starting with k-6.1

* Bagas Sanjaya <bagasdotme@...il.com> [231123 00:07]:
> On Wed, Nov 22, 2023 at 08:03:19PM +0000, Chun Ng wrote:
> > Hi,
> > 
> > Recently I observed a performance regression in the mmap(..) system call. I tried both vanilla kernels and Raspberry Pi kernels on a Raspberry Pi 4 box, and the results are pretty consistent among them.
> > 
> > Bisection showed that the regression starts from k-6.1, and the latest vanilla k-6.7 is still showing the same regression.

This is almost certainly the maple tree.  The tree is slower on writes
than the rbtree, so if the benchmark mmaps/munmaps in a tight loop
you will see this slowdown.  What this benchmark measures is the speed
of inserting and removing a VMA, which is not really representative of
a real workload - we usually use the mapping between adding and
removing it.
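
To make that concrete, here is a minimal user-space sketch of such a
tight loop.  It is not the reporter's test program (that one is only
linked in the original report); the function name, the iteration count
and the use of Python's mmap module are assumptions for illustration.
It times the whole map/unmap round trip from user space, including
interpreter overhead, rather than do_mmap() via ftrace, so the absolute
numbers will not match the ones quoted in the report.

```python
import mmap
import time

PAGE_SIZE = 4096  # the report maps a single 4K page


def time_mmap_munmap(iterations=10000, length=PAGE_SIZE):
    """Return the mean wall-clock nanoseconds per map/unmap pair.

    Hypothetical helper: each iteration creates an anonymous private
    mapping and drops it immediately, never touching the memory.
    """
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    start = time.perf_counter_ns()
    for _ in range(iterations):
        m = mmap.mmap(-1, length, flags=flags)  # mmap(): add the mapping
        m.close()                               # munmap(): remove it again
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    print(f"{time_mmap_munmap():.0f} ns per mmap/munmap pair")
```

Since the mapping is never read or written, no page fault is taken on
it, so a loop like this exercises only the VMA tree updates.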

What this gains us is the ability to remove contention on the mmap lock
during page faults.  If you were to test contention around that lock,
you would see a slowdown until you reach v6.4, where per-VMA locking
started to show up.  Benchmarking later versions will show more types
of faults being handled outside of the mmap lock, up to (I believe)
v6.6, where most (or all?) types are supported.

Although this is expected, I am still looking to reduce the impact on
any real workloads that may suffer.  I've been reducing the
allocations, for example.

> > 
> > The test program calls mmap/munmap for a 4K page with MAP_ANON and MAP_PRIVATE flags, and ftrace is used to measure the time spent on the do_mmap(..) call.  The measured times from a sample run on different vanilla kernel versions are:
> > k-5.10 and k-6.0: ~157us
> > k-6.1: ~194us
> > k-6.7: ~214us

I would have expected v6.7 to remain closer to v6.1, but that may depend
on the minor versions you have been testing and what fixes have landed
there.
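
For scale, the relative increases implied by the quoted figures can be
worked out as below.  This is only arithmetic on the three approximate
numbers from the report; the dictionary and helper names are made up
for the example.

```python
# Approximate do_mmap() times in microseconds, copied from the report.
latency_us = {"6.0": 157, "6.1": 194, "6.7": 214}


def percent_increase(old, new):
    """Relative increase of new over old, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    base = latency_us["6.0"]
    for version in ("6.1", "6.7"):
        delta = percent_increase(base, latency_us[version])
        print(f"v{version}: +{delta:.1f}% over v6.0")
    step = percent_increase(latency_us["6.1"], latency_us["6.7"])
    print(f"v6.7: +{step:.1f}% over v6.1")
```

That is roughly 24% from v6.0 to v6.1, and a further 10% or so from
v6.1 to v6.7.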


> > Results are pretty consistent across multiple runs with a small percentage variance.  Ftrace shows that latency of mmap_region(...) has increased since k-6.1.  For an application that makes frequent mmap(..) calls, the accumulated extra latency is very noticeable.
> > 
> > Please find the ftrace results and kernel config files in this folder:
> > https://drive.google.com/drive/folders/1qy8YTBqxu8Gdbs7IigYbSd4FXldId5sd?usp=drive_link
> > 
> > The test program can be found in here:
> > https://drive.google.com/file/d/1tG6_BbQMCHwfKebvAIAg_xqbM_lpPcuM/view?usp=sharing
> > 
> > Info on the testing environment:
> > cpufreq_governor: performance
> > Test machine: Raspberry Pi 4, 8GB DDR
> > SCHED_FIFO with priority 99 for running the test program
> > 
> > Vanilla kernels are not tainted. However on k-6.0 and k-6.7, I have to patch the drivers/clk/bcm/clk-raspberrypi.c file with the version in Raspberry Pi kernel tree for the CPU frequency governor to work.
> > 
> 
> The next step is to find the commit that introduces your regression with
> `git bisect`. If you haven't done so, see
> Documentation/admin-guide/bug-bisect.rst for instructions.
> 
> Anyway, I'm adding this regression to regzbot:
> 
> #regzbot ^introduced: v6.0..v6.1
> 
> Thanks.
> 
> -- 
> An old man doll... just what I always wanted! - Clara

