Date:   Mon, 5 Jun 2023 11:28:35 +0800
From:   Peng Zhang <zhangpeng.00@...edance.com>
To:     "Yin, Fengwei" <fengwei.yin@...el.com>
Cc:     maple-tree@...ts.infradead.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, "Liu, Yujie" <yujie.liu@...el.com>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 00/14] Reduce preallocations for maple tree



On 2023/6/2 16:10, Yin, Fengwei wrote:
> Hi Liam,
> 
> On 6/1/2023 10:15 AM, Liam R. Howlett wrote:
>> Initial work on preallocations showed no regression in performance
>> during testing, but recently some users (both on [1] and off [android]
>> list) have reported that preallocating the worst-case number of nodes
>> has caused some slowdown.  This patch set addresses the number of
>> allocations in a few ways.
>>
>> Most munmap() operations will remove a single VMA, so
>> leverage the fact that the maple tree can place a single pointer at
>> range 0 - 0 without allocating.  This is done by changing the index in
>> the 'sidetree'.
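
(For illustration only, not code from the series: a minimal sketch of the
point above, with names assumed from the munmap side-tree path. A lone entry
whose range is 0 - 0 lives directly in the root pointer, so storing it needs
no maple node allocation.)

	/* sketch: park the single detached VMA at range 0 - 0 in the side tree */
	MA_STATE(mas_detach, &mt_detach, 0, 0);		/* index == last == 0 */

	/* a single root-pointer entry: no maple node is allocated here */
	if (mas_store_gfp(&mas_detach, vma, GFP_KERNEL))
		return -ENOMEM;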
>>
>> Re-introduce the entry argument to mas_preallocate() so that a more
>> intelligent guess of the node count can be made.
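
(Likewise a sketch only, not from the series, assuming a VMA stored over
[vm_start, vm_end - 1] in mm->mm_mt: with the entry passed in,
mas_preallocate() can inspect the pending write and request fewer nodes, and
the later store consumes whatever was reserved.)

	MA_STATE(mas, &mm->mm_mt, vma->vm_start, vma->vm_end - 1);

	/* entry is passed again, so the node-count guess can match the write */
	if (mas_preallocate(&mas, vma, GFP_KERNEL))
		return -ENOMEM;
	mas_store_prealloc(&mas, vma);	/* uses (or skips) the reserved nodes */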
>>
>> Patches are in the following order:
>> 0001-0002: Testing framework for benchmarking some operations
>> 0003-0004: Reduction of maple node allocation in sidetree
>> 0005:      Small cleanup of do_vmi_align_munmap()
>> 0006-0013: mas_preallocate() calculation change
>> 0014:      Change the vma iterator order
> I ran the AIM:page_test on an Ice Lake 48C/96T + 192G RAM platform with
> this patchset.
> 
> The result shows a slight improvement:
> Base (next-20230602):
>    503880
> Base with this patchset:
>    519501
> 
> But they are still far from the non-regression result (commit 7be1c1a3c7b1):
>    718080
> 
> 
> Some other information I collected:
> With Base, mas_alloc_nodes() is always hit with request: 7.
> With this patchset, the requests are 1 or 5.
> 
> I suppose this is the reason for the improvement from 503880 to 519501.
> 
> With commit 7be1c1a3c7b1, mas_store_gfp() in do_brk_flags() never triggered
> a mas_alloc_nodes() call. Thanks.
Hi Fengwei,

I think it may be related to an inaccurate node count in the
pre-allocation. I slightly modified the pre-allocation in this patchset,
but I don't know whether it helps. It would be great if you could help
test it and help pinpoint the cause. Below is the diff, which can be
applied on top of this patchset.

Thanks,
Peng

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 5ea211c3f186..e67bf2744384 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5575,9 +5575,11 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
  		goto ask_now;
  	}

-	/* New root needs a singe node */
-	if (unlikely(mte_is_root(mas->node)))
-		goto ask_now;
+	if ((node_size == wr_mas.node_end + 1 &&
+	     mas->offset == wr_mas.node_end) ||
+	    (node_size == wr_mas.node_end &&
+	     wr_mas.offset_end - mas->offset == 1))
+		return 0;

  	/* Potential spanning rebalance collapsing a node, use worst-case */
  	if (node_size  - 1 <= mt_min_slots[wr_mas.type])
@@ -5590,7 +5592,6 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
  	if (likely(!mas_is_err(mas)))
  		return 0;

-	mas_set_alloc_req(mas, 0);
  	ret = xa_err(mas->node);
  	mas_reset(mas);
  	mas_destroy(mas);
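
(One way to double-check whether the early return above really avoids the
allocations - a suggestion only, not part of the diff - is a temporary
trace_printk() at the top of mas_alloc_nodes() in lib/maple_tree.c, then
watching the ftrace buffer while the AIM test runs:)

	/* temporary debug aid, drop before merging: log every node request */
	trace_printk("%s: request %lu\n", __func__,
		     (unsigned long)mas_alloc_req(mas));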


> 
> 
> Regards
> Yin, Fengwei
> 
>>
>> [1] https://lore.kernel.org/linux-mm/202305061457.ac15990c-yujie.liu@intel.com/
>>
>> Liam R. Howlett (14):
>>    maple_tree: Add benchmarking for mas_for_each
>>    maple_tree: Add benchmarking for mas_prev()
>>    mm: Move unmap_vmas() declaration to internal header
>>    mm: Change do_vmi_align_munmap() side tree index
>>    mm: Remove prev check from do_vmi_align_munmap()
>>    maple_tree: Introduce __mas_set_range()
>>    mm: Remove re-walk from mmap_region()
>>    maple_tree: Re-introduce entry to mas_preallocate() arguments
>>    mm: Use vma_iter_clear_gfp() in nommu
>>    mm: Set up vma iterator for vma_iter_prealloc() calls
>>    maple_tree: Move mas_wr_end_piv() below mas_wr_extend_null()
>>    maple_tree: Update mas_preallocate() testing
>>    maple_tree: Refine mas_preallocate() node calculations
>>    mm/mmap: Change vma iteration order in do_vmi_align_munmap()
>>
>>   fs/exec.c                        |   1 +
>>   include/linux/maple_tree.h       |  23 ++++-
>>   include/linux/mm.h               |   4 -
>>   lib/maple_tree.c                 |  78 ++++++++++----
>>   lib/test_maple_tree.c            |  74 +++++++++++++
>>   mm/internal.h                    |  40 ++++++--
>>   mm/memory.c                      |  16 ++-
>>   mm/mmap.c                        | 171 ++++++++++++++++---------------
>>   mm/nommu.c                       |  45 ++++----
>>   tools/testing/radix-tree/maple.c |  59 ++++++-----
>>   10 files changed, 331 insertions(+), 180 deletions(-)
>>
