Message-ID: <7a5dc9ce-b58f-e1b3-db1a-d00a8a556ae5@intel.com>
Date: Fri, 2 Jun 2023 16:10:40 +0800
From: "Yin, Fengwei" <fengwei.yin@...el.com>
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>
CC: <maple-tree@...ts.infradead.org>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>, "Liu, Yujie" <yujie.liu@...el.com>
Subject: Re: [PATCH 00/14] Reduce preallocations for maple tree
Hi Liam,
On 6/1/2023 10:15 AM, Liam R. Howlett wrote:
> Initial work on preallocations showed no regression in performance
> during testing, but recently some users (both on [1] and off [android]
> list) have reported that preallocating the worst-case number of nodes
> has caused some slow down. This patch set addresses the number of
> allocations in a few ways.
>
> During munmap() most munmap() operations will remove a single VMA, so
> leverage the fact that the maple tree can place a single pointer at
> range 0 - 0 without allocating. This is done by changing the index in
> the 'sidetree'.
>
> Re-introduce the entry argument to mas_preallocate() so that a more
> intelligent guess of the node count can be made.
>
> Patches are in the following order:
> 0001-0002: Testing framework for benchmarking some operations
> 0003-0004: Reduction of maple node allocation in sidetree
> 0005: Small cleanup of do_vmi_align_munmap()
> 0006-0013: mas_preallocate() calculation change
> 0014: Change the vma iterator order
I ran the AIM:page_test benchmark on an Ice Lake 48C/96T + 192G RAM platform
with this patchset.
The result shows a slight improvement:
Base (next-20230602):
503880
Base with this patchset:
519501
But both are far from the no-regression result (commit 7be1c1a3c7b1):
718080
Some other information I collected:
With the base, mas_alloc_nodes() is always hit with a request of 7.
With this patchset, the requests are 1 or 5.
I suppose this is the reason for the improvement from 503880 to 519501.
With commit 7be1c1a3c7b1, mas_store_gfp() in do_brk_flags() never triggered
a mas_alloc_nodes() call. Thanks.
Regards
Yin, Fengwei
>
> [1] https://lore.kernel.org/linux-mm/202305061457.ac15990c-yujie.liu@intel.com/
>
> Liam R. Howlett (14):
> maple_tree: Add benchmarking for mas_for_each
> maple_tree: Add benchmarking for mas_prev()
> mm: Move unmap_vmas() declaration to internal header
> mm: Change do_vmi_align_munmap() side tree index
> mm: Remove prev check from do_vmi_align_munmap()
> maple_tree: Introduce __mas_set_range()
> mm: Remove re-walk from mmap_region()
> maple_tree: Re-introduce entry to mas_preallocate() arguments
> mm: Use vma_iter_clear_gfp() in nommu
> mm: Set up vma iterator for vma_iter_prealloc() calls
> maple_tree: Move mas_wr_end_piv() below mas_wr_extend_null()
> maple_tree: Update mas_preallocate() testing
> maple_tree: Refine mas_preallocate() node calculations
> mm/mmap: Change vma iteration order in do_vmi_align_munmap()
>
> fs/exec.c | 1 +
> include/linux/maple_tree.h | 23 ++++-
> include/linux/mm.h | 4 -
> lib/maple_tree.c | 78 ++++++++++----
> lib/test_maple_tree.c | 74 +++++++++++++
> mm/internal.h | 40 ++++++--
> mm/memory.c | 16 ++-
> mm/mmap.c | 171 ++++++++++++++++---------------
> mm/nommu.c | 45 ++++----
> tools/testing/radix-tree/maple.c | 59 ++++++-----
> 10 files changed, 331 insertions(+), 180 deletions(-)
>