Message-ID: <4fb5f66d-c8c2-f857-7461-b974154dbc2b@bytedance.com>
Date:   Mon, 5 Jun 2023 15:59:32 +0800
From:   Peng Zhang <zhangpeng.00@...edance.com>
To:     "Yin, Fengwei" <fengwei.yin@...el.com>
Cc:     maple-tree@...ts.infradead.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, "Liu, Yujie" <yujie.liu@...el.com>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Peng Zhang <zhangpeng.00@...edance.com>
Subject: Re: [PATCH 00/14] Reduce preallocations for maple tree



On 2023/6/5 14:18, Yin, Fengwei wrote:
> 
> 
> On 6/5/2023 12:41 PM, Yin Fengwei wrote:
>> Hi Peng,
>>
>> On 6/5/23 11:28, Peng Zhang wrote:
>>>
>>>
>>> On 2023/6/2 16:10, Yin, Fengwei wrote:
>>>> Hi Liam,
>>>>
>>>> On 6/1/2023 10:15 AM, Liam R. Howlett wrote:
>>>>> Initial work on preallocations showed no regression in performance
>>>>> during testing, but recently some users (both on [1] and off [android]
>>>>> list) have reported that preallocating the worst-case number of nodes
>>>>> has caused some slowdown.  This patch set reduces the number of
>>>>> allocations in a few ways.
>>>>>
>>>>> Most munmap() operations remove a single VMA, so leverage the fact
>>>>> that the maple tree can place a single pointer at range 0 - 0 without
>>>>> allocating.  This is done by changing the index in the 'sidetree'.
>>>>>
>>>>> Re-introduce the entry argument to mas_preallocate() so that a more
>>>>> intelligent guess of the node count can be made.
>>>>>
>>>>> Patches are in the following order:
>>>>> 0001-0002: Testing framework for benchmarking some operations
>>>>> 0003-0004: Reduction of maple node allocation in sidetree
>>>>> 0005:      Small cleanup of do_vmi_align_munmap()
>>>>> 0006-0013: mas_preallocate() calculation change
>>>>> 0014:      Change the vma iterator order
>>>> I ran the AIM:page_test on an IceLake 48C/96T + 192G RAM platform with
>>>> this patchset.
>>>>
>>>> The result shows a slight improvement:
>>>> Base (next-20230602):
>>>>     503880
>>>> Base with this patchset:
>>>>     519501
>>>>
>>>> But both are far from the non-regression result (commit 7be1c1a3c7b1):
>>>>     718080
>>>>
>>>>
>>>> Some other information I collected:
>>>> With Base, mas_alloc_nodes() is always called with a request of 7.
>>>> With this patchset, the requests are 1 or 5.
>>>>
>>>> I suppose this is the reason for the improvement from 503880 to 519501.
>>>>
>>>> With commit 7be1c1a3c7b1, mas_store_gfp() in do_brk_flags() never triggered
>>>> a mas_alloc_nodes() call. Thanks.
>>> Hi Fengwei,
>>>
>>> I think it may be related to an inaccurate number of nodes being
>>> allocated in the preallocation. I slightly modified the preallocation in
>>> this patchset, but I don't know whether it works. It would be great if you
>>> could help test it and help pinpoint the cause. Below is the diff, which
>>> can be applied on top of this patchset.
>> I tried the patch; it eliminates the calls to mas_alloc_nodes() during
>> the test. But the benchmark result improved only slightly:
>>    529040
>>
>> It's still much lower than the non-regression result. I will also
>> double-check the non-regression result.
> Just noticed that commit f5715584af95 gives validate_mm() two implementations
> selected by CONFIG_DEBUG_VM instead of CONFIG_DEBUG_VM_MAPLE_TREE. I have
> CONFIG_DEBUG_VM but not CONFIG_DEBUG_VM_MAPLE_TREE defined, so it's not an
> apples-to-apples comparison.
> 
> 
> I disabled CONFIG_DEBUG_VM, re-ran the test, and got:
> Before preallocation change (7be1c1a3c7b1):
>      770100
> After preallocation change (28c5609fb236):
>      680000
> With Liam's fix:
>      702100
> plus Peng's fix:
>      725900
Thank you for testing. It now seems that the performance
regression is not that large.
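
For reference, the relative gaps behind that statement can be worked out from the four scores quoted above (CONFIG_DEBUG_VM disabled, higher is better). This is only a rough sketch of the arithmetic; the helper name and the labels are made up for illustration, and only the four scores come from the test results:

```python
# AIM:page_test scores quoted above, with CONFIG_DEBUG_VM disabled.
BASELINE = 770100  # before preallocation change (7be1c1a3c7b1)

scores = {
    "after preallocation change (28c5609fb236)": 680000,
    "with Liam's fix": 702100,
    "plus Peng's fix": 725900,
}


def regression_pct(score, baseline=BASELINE):
    """Percentage drop of score relative to baseline (illustrative helper)."""
    return (baseline - score) / baseline * 100.0


for label, score in scores.items():
    print(f"{label}: {regression_pct(score):.1f}% below baseline")
```

By this arithmetic the gap to the pre-change result shrinks from about 11.7% to about 5.7% with both fixes applied.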
> 
> 
> Regards
> Yin, Fengwei
> 
>>
>>
>> Regards
>> Yin, Fengwei
>>
>>>
>>> Thanks,
>>> Peng
>>>
>>> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
>>> index 5ea211c3f186..e67bf2744384 100644
>>> --- a/lib/maple_tree.c
>>> +++ b/lib/maple_tree.c
>>> @@ -5575,9 +5575,11 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
>>>           goto ask_now;
>>>       }
>>>
>>> -    /* New root needs a singe node */
>>> -    if (unlikely(mte_is_root(mas->node)))
>>> -        goto ask_now;
>>> +    if ((node_size == wr_mas.node_end + 1 &&
>>> +         mas->offset == wr_mas.node_end) ||
>>> +        (node_size == wr_mas.node_end &&
>>> +         wr_mas.offset_end - mas->offset == 1))
>>> +        return 0;
>>>
>>>       /* Potential spanning rebalance collapsing a node, use worst-case */
>>>       if (node_size  - 1 <= mt_min_slots[wr_mas.type])
>>> @@ -5590,7 +5592,6 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
>>>       if (likely(!mas_is_err(mas)))
>>>           return 0;
>>>
>>> -    mas_set_alloc_req(mas, 0);
>>>       ret = xa_err(mas->node);
>>>       mas_reset(mas);
>>>       mas_destroy(mas);
>>>
>>>
>>>>
>>>>
>>>> Regards
>>>> Yin, Fengwei
>>>>
>>>>>
>>>>> [1] https://lore.kernel.org/linux-mm/202305061457.ac15990c-yujie.liu@intel.com/
>>>>>
>>>>> Liam R. Howlett (14):
>>>>>     maple_tree: Add benchmarking for mas_for_each
>>>>>     maple_tree: Add benchmarking for mas_prev()
>>>>>     mm: Move unmap_vmas() declaration to internal header
>>>>>     mm: Change do_vmi_align_munmap() side tree index
>>>>>     mm: Remove prev check from do_vmi_align_munmap()
>>>>>     maple_tree: Introduce __mas_set_range()
>>>>>     mm: Remove re-walk from mmap_region()
>>>>>     maple_tree: Re-introduce entry to mas_preallocate() arguments
>>>>>     mm: Use vma_iter_clear_gfp() in nommu
>>>>>     mm: Set up vma iterator for vma_iter_prealloc() calls
>>>>>     maple_tree: Move mas_wr_end_piv() below mas_wr_extend_null()
>>>>>     maple_tree: Update mas_preallocate() testing
>>>>>     maple_tree: Refine mas_preallocate() node calculations
>>>>>     mm/mmap: Change vma iteration order in do_vmi_align_munmap()
>>>>>
>>>>>    fs/exec.c                        |   1 +
>>>>>    include/linux/maple_tree.h       |  23 ++++-
>>>>>    include/linux/mm.h               |   4 -
>>>>>    lib/maple_tree.c                 |  78 ++++++++++----
>>>>>    lib/test_maple_tree.c            |  74 +++++++++++++
>>>>>    mm/internal.h                    |  40 ++++++--
>>>>>    mm/memory.c                      |  16 ++-
>>>>>    mm/mmap.c                        | 171 ++++++++++++++++---------------
>>>>>    mm/nommu.c                       |  45 ++++----
>>>>>    tools/testing/radix-tree/maple.c |  59 ++++++-----
>>>>>    10 files changed, 331 insertions(+), 180 deletions(-)
>>>>>