Message-ID: <20230605144438.rloukl3mccgkfxam@revolver>
Date:   Mon, 5 Jun 2023 10:44:38 -0400
From:   "Liam R. Howlett" <Liam.Howlett@...cle.com>
To:     Peng Zhang <zhangpeng.00@...edance.com>
Cc:     "Yin, Fengwei" <fengwei.yin@...el.com>,
        maple-tree@...ts.infradead.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, "Liu, Yujie" <yujie.liu@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 00/14] Reduce preallocations for maple tree

* Peng Zhang <zhangpeng.00@...edance.com> [230605 10:27]:
> 
> 
> On 2023/6/5 22:03, Liam R. Howlett wrote:
> > * Peng Zhang <zhangpeng.00@...edance.com> [230605 03:59]:
> > > 
> > > 
> > > On 2023/6/5 14:18, Yin, Fengwei wrote:
> > > > 
> > > > 
> > > > On 6/5/2023 12:41 PM, Yin Fengwei wrote:
> > > > > Hi Peng,
> > > > > 
> > > > > On 6/5/23 11:28, Peng Zhang wrote:
> > > > > > 
> > > > > > 
> > > > > > On 2023/6/2 16:10, Yin, Fengwei wrote:
> > > > > > > Hi Liam,
> > > > > > > 
> > > > > > > On 6/1/2023 10:15 AM, Liam R. Howlett wrote:
> > > > > > > > Initial work on preallocations showed no regression in performance
> > > > > > > > during testing, but recently some users (both on [1] and off [android]
> > > > > > > > list) have reported that preallocating the worst-case number of nodes
> > > > > > > > has caused some slow down.  This patch set addresses the number of
> > > > > > > > allocations in a few ways.
> > > > > > > > 
> > > > > > > > During munmap() most munmap() operations will remove a single VMA, so
> > > > > > > > leverage the fact that the maple tree can place a single pointer at
> > > > > > > > range 0 - 0 without allocating.  This is done by changing the index in
> > > > > > > > the 'sidetree'.
> > > > > > > > 
> > > > > > > > Re-introduce the entry argument to mas_preallocate() so that a more
> > > > > > > > intelligent guess of the node count can be made.
> > > > > > > > 
> > > > > > > > Patches are in the following order:
> > > > > > > > 0001-0002: Testing framework for benchmarking some operations
> > > > > > > > 0003-0004: Reduction of maple node allocation in sidetree
> > > > > > > > 0005:      Small cleanup of do_vmi_align_munmap()
> > > > > > > > 0006-0013: mas_preallocate() calculation change
> > > > > > > > 0014:      Change the vma iterator order
> > > > > > > I ran AIM:page_test on an IceLake 48C/96T + 192G RAM platform with
> > > > > > > this patchset.
> > > > > > > 
> > > > > > > The result shows a slight improvement:
> > > > > > > Base (next-20230602):
> > > > > > >      503880
> > > > > > > Base with this patchset:
> > > > > > >      519501
> > > > > > > 
> > > > > > > But both are far from the non-regression result (commit 7be1c1a3c7b1):
> > > > > > >      718080
> > > > > > > 
> > > > > > > 
> > > > > > > Some other information I collected:
> > > > > > > With Base, mas_alloc_nodes() is always hit with request: 7.
> > > > > > > With this patchset, the requests are 1 or 5.
> > > > > > > 
> > > > > > > I suppose this is the reason for improvement from 503880 to 519501.
> > > > > > > 
> > > > > > > With commit 7be1c1a3c7b1, mas_store_gfp() in do_brk_flags never triggered
> > > > > > > mas_alloc_nodes() call. Thanks.
> > > > > > Hi Fengwei,
> > > > > > 
> > > > > > I think it may be related to the inaccurate number of nodes allocated
> > > > > > in the pre-allocation. I slightly modified the pre-allocation in this
> > > > > > patchset, but I don't know if it works. It would be great if you could
> > > > > > help test it, and help pinpoint the cause. Below is the diff, which can
> > > > > > be applied on top of this patchset.
> > > > > I tried the patch; it eliminates the calls to mas_alloc_nodes() during
> > > > > the test. But the benchmark result improved only slightly:
> > > > >     529040
> > > > > 
> > > > > But it's still much lower than the non-regression result. I will also
> > > > > double-check the non-regression result.
> > > > Just noticed that commit f5715584af95 gives validate_mm() two implementations
> > > > based on CONFIG_DEBUG_VM (instead of CONFIG_DEBUG_VM_MAPLE_TREE). I have
> > > > CONFIG_DEBUG_VM but not CONFIG_DEBUG_VM_MAPLE_TREE defined. So it's not an
> > > > apples-to-apples comparison.
> > 
> > You mean "mm: update validate_mm() to use vma iterator" here I guess.  I
> > have it as a different commit id in my branch.
> > 
> > I 'restored' some of the checking because I was able to work around not
> > having the mt_dump() definition with the vma iterator.  I'm now
> > wondering how widely CONFIG_DEBUG_VM is used and whether I should not
> > have added these extra checks.
> > 
> > > > 
> > > > 
> > > > I disabled CONFIG_DEBUG_VM, re-ran the test, and got:
> > > > Before preallocation change (7be1c1a3c7b1):
> > > >       770100
> > > > After preallocation change (28c5609fb236):
> > > >       680000
> > > > With Liam's fix:
> > > >       702100
> > > > plus Peng's fix:
> > > >       725900
> > > Thank you for testing. It now seems that the performance
> > > regression is not that large.
> > 
> > We are also too strict on the reset during mas_store_prealloc() checking
> > for a spanning write.  I have a fix for this for v2 of the patch set,
> > although I suspect it will not make a huge difference.
> > 
> > > > 
> > > > 
> > > > Regards
> > > > Yin, Fengwei
> > > > 
> > > > > 
> > > > > 
> > > > > Regards
> > > > > Yin, Fengwei
> > > > > 
> > > > > > 
> > > > > > Thanks,
> > > > > > Peng
> > > > > > 
> > > > > > diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> > > > > > index 5ea211c3f186..e67bf2744384 100644
> > > > > > --- a/lib/maple_tree.c
> > > > > > +++ b/lib/maple_tree.c
> > > > > > @@ -5575,9 +5575,11 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
> > > > > >            goto ask_now;
> > > > > >        }
> > > > > > 
> > > > > > -    /* New root needs a singe node */
> > > > > > -    if (unlikely(mte_is_root(mas->node)))
> > > > > > -        goto ask_now;
> > 
> > Why did you drop this?  If we are creating a new root we will only need
> > one node.
> The code below handles the root case perfectly,
> we don't need additional checks.
> 	if (node_size  - 1 <= mt_min_slots[wr_mas.type])
> 		request = mas_mt_height(mas) * 2 - 1;

Unless we have the minimum for a node, in which case we will fall
through and ask for one node.  This works and is rare enough so I'll
drop it.  Thanks.
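
To make the point about the root case concrete, here is a minimal toy model of the request selection discussed above. It is only a sketch and not the code in lib/maple_tree.c: the function name and its parameters are hypothetical, and it assumes the request defaults to a single node when the worst-case check does not trigger.

```c
/*
 * Toy model only; names are hypothetical and this is not the kernel
 * implementation.  node_size, min_slots and tree_height stand in for
 * node_size, mt_min_slots[wr_mas.type] and mas_mt_height(mas).
 */
static unsigned int toy_prealloc_request(unsigned int node_size,
					 unsigned int min_slots,
					 unsigned int tree_height)
{
	/* Potential spanning rebalance collapsing a node, use worst-case */
	if (node_size - 1 <= min_slots)
		return tree_height * 2 - 1;

	/* Assumed default: fall through and ask for a single node */
	return 1;
}
```

In this model a tree of height 1 yields a request of one node on either path, which is why a separate root check adds nothing here.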

> > 
> > > > > > +    if ((node_size == wr_mas.node_end + 1 &&
> > > > > > +         mas->offset == wr_mas.node_end) ||
> > > > > > +        (node_size == wr_mas.node_end &&
> > > > > > +         wr_mas.offset_end - mas->offset == 1))
> > > > > > +        return 0;
> > 
> > I will add this to v2 as well, or something similar.
> > 
> > > > > > 
> > > > > >        /* Potential spanning rebalance collapsing a node, use worst-case */
> > > > > >        if (node_size  - 1 <= mt_min_slots[wr_mas.type])
> > > > > > @@ -5590,7 +5592,6 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
> > > > > >        if (likely(!mas_is_err(mas)))
> > > > > >            return 0;
> > > > > > 
> > > > > > -    mas_set_alloc_req(mas, 0);
> > 
> > Why did you drop this?  It seems like a worthwhile cleanup on failure.
> Because we will clear it in mas_node_count_gfp()->mas_alloc_nodes().

On failure we set the alloc request to the remainder of what was not
allocated.
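
As an illustration of that behaviour, the following is a toy model and not the actual mas_alloc_nodes(): the struct, the function, and the 'available' parameter (how many nodes the allocator can hand out) are hypothetical names introduced only for this sketch.

```c
/* Toy model only; all names here are hypothetical. */
struct toy_state {
	unsigned int allocated;	/* nodes obtained so far */
	unsigned int alloc_req;	/* outstanding request */
};

/* Try to get 'request' nodes when only 'available' can be obtained. */
static int toy_alloc_nodes(struct toy_state *s, unsigned int request,
			   unsigned int available)
{
	unsigned int got = request < available ? request : available;

	s->allocated += got;
	/* Zero on success; on failure, the remainder that was not allocated */
	s->alloc_req = request - got;

	return s->alloc_req ? -1 : 0;
}
```

In this model a failed call leaves a non-zero alloc_req behind, so a caller that wants a clean state on the error path still has to clear it explicitly.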

> > 
> > > > > >        ret = xa_err(mas->node);
> > > > > >        mas_reset(mas);
> > > > > >        mas_destroy(mas);
> > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > Regards
> > > > > > > Yin, Fengwei
> > > > > > > 
> > > > > > > > 
> > > > > > > > [1] https://lore.kernel.org/linux-mm/202305061457.ac15990c-yujie.liu@intel.com/
> > > > > > > > 
> > > > > > > > Liam R. Howlett (14):
> > > > > > > >      maple_tree: Add benchmarking for mas_for_each
> > > > > > > >      maple_tree: Add benchmarking for mas_prev()
> > > > > > > >      mm: Move unmap_vmas() declaration to internal header
> > > > > > > >      mm: Change do_vmi_align_munmap() side tree index
> > > > > > > >      mm: Remove prev check from do_vmi_align_munmap()
> > > > > > > >      maple_tree: Introduce __mas_set_range()
> > > > > > > >      mm: Remove re-walk from mmap_region()
> > > > > > > >      maple_tree: Re-introduce entry to mas_preallocate() arguments
> > > > > > > >      mm: Use vma_iter_clear_gfp() in nommu
> > > > > > > >      mm: Set up vma iterator for vma_iter_prealloc() calls
> > > > > > > >      maple_tree: Move mas_wr_end_piv() below mas_wr_extend_null()
> > > > > > > >      maple_tree: Update mas_preallocate() testing
> > > > > > > >      maple_tree: Refine mas_preallocate() node calculations
> > > > > > > >      mm/mmap: Change vma iteration order in do_vmi_align_munmap()
> > > > > > > > 
> > > > > > > >     fs/exec.c                        |   1 +
> > > > > > > >     include/linux/maple_tree.h       |  23 ++++-
> > > > > > > >     include/linux/mm.h               |   4 -
> > > > > > > >     lib/maple_tree.c                 |  78 ++++++++++----
> > > > > > > >     lib/test_maple_tree.c            |  74 +++++++++++++
> > > > > > > >     mm/internal.h                    |  40 ++++++--
> > > > > > > >     mm/memory.c                      |  16 ++-
> > > > > > > >     mm/mmap.c                        | 171 ++++++++++++++++---------------
> > > > > > > >     mm/nommu.c                       |  45 ++++----
> > > > > > > >     tools/testing/radix-tree/maple.c |  59 ++++++-----
> > > > > > > >     10 files changed, 331 insertions(+), 180 deletions(-)
> > > > > > > > 
