Message-Id: <20231024083258.65750-1-zhangpeng.00@bytedance.com>
Date: Tue, 24 Oct 2023 16:32:48 +0800
From: Peng Zhang <zhangpeng.00@...edance.com>
To: Liam.Howlett@...cle.com, corbet@....net, akpm@...ux-foundation.org,
willy@...radead.org, brauner@...nel.org, surenb@...gle.com,
michael.christie@...cle.com, mjguzik@...il.com,
mathieu.desnoyers@...icios.com, npiggin@...il.com,
peterz@...radead.org, oliver.sang@...el.com, mst@...hat.com
Cc: zhangpeng.00@...edance.com, maple-tree@...ts.infradead.org,
linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: [PATCH v6 00/10] Introduce __mt_dup() to improve the performance of fork()
Hi all,
This series introduces __mt_dup() to improve the performance of fork(). When
the mmap is duplicated, all VMAs are traversed and inserted one by one into
the new maple tree, which causes the tree to be rebalanced multiple times.
Rebalancing the maple tree is a costly operation. To duplicate VMAs more
efficiently, mtree_dup() and __mt_dup() are introduced for the maple tree. They
copy the nodes of an existing tree instead of rebuilding it through repeated
insertions.
Here are some algorithmic details about {mtree,__mt}_dup(). We perform a DFS
pre-order traversal of all nodes in the source maple tree. During this process,
we fully copy the nodes from the source tree to the new tree. This involves
memory allocation, and when encountering a new node, if it is a non-leaf node,
all its child nodes are allocated at once.
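
The traversal described above can be illustrated with a small user-space
sketch. It is not the code from patch [3/10]: struct toy_node, TOY_SLOTS and
every toy_*() function are hypothetical names invented for this example, and
calloc() stands in for whatever allocator the kernel uses. The sketch only
shows the shape of the idea: a pre-order walk that copies each node when it
is first reached and allocates all children of a non-leaf node at that moment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SLOTS 4	/* hypothetical fan-out, not the maple tree's */

/* Hypothetical toy node; this is NOT the kernel's struct maple_node. */
struct toy_node {
	struct toy_node *parent;
	struct toy_node *child[TOY_SLOTS];
	unsigned long pivot[TOY_SLOTS];
	unsigned char nr_children;	/* 0 means leaf */
	unsigned char parent_slot;	/* index of this node in its parent */
};

static void toy_free(struct toy_node *n)
{
	for (int i = 0; n && i < n->nr_children; i++)
		toy_free(n->child[i]);
	free(n);
}

/* Copy src's payload; if src is not a leaf, allocate all children of dst now.
 * calloc() makes them look like empty leaves, so toy_free() is always safe. */
static int toy_copy_node(struct toy_node *dst, const struct toy_node *src)
{
	memcpy(dst->pivot, src->pivot, sizeof(dst->pivot));
	for (int i = 0; i < src->nr_children; i++) {
		struct toy_node *c = calloc(1, sizeof(*c));
		if (!c)
			return -1;
		c->parent = dst;
		c->parent_slot = (unsigned char)i;
		dst->child[i] = c;
		dst->nr_children = (unsigned char)(i + 1);
	}
	return 0;
}

/* Duplicate the tree rooted at src_root; NULL on allocation failure. */
struct toy_node *toy_dup(const struct toy_node *src_root)
{
	const struct toy_node *src = src_root;
	struct toy_node *new_root, *dst;

	if (!src_root || !(new_root = dst = calloc(1, sizeof(*dst))))
		return NULL;
	for (;;) {
		if (toy_copy_node(dst, src)) {
			toy_free(new_root);
			return NULL;
		}
		if (src->nr_children) {		/* pre-order: descend first */
			src = src->child[0];
			dst = dst->child[0];
			continue;
		}
		/* Leaf: climb while we are the last child of our parent. */
		while (src != src_root &&
		       src->parent_slot == src->parent->nr_children - 1) {
			src = src->parent;
			dst = dst->parent;
		}
		if (src == src_root)
			return new_root;
		src = src->parent->child[src->parent_slot + 1];
		dst = dst->parent->child[dst->parent_slot + 1];
	}
}

/* Test helper: hang child c under parent p in the next free slot. */
static void toy_link(struct toy_node *p, struct toy_node *c, unsigned long pivot)
{
	assert(p->nr_children < TOY_SLOTS);
	c->pivot[0] = pivot;
	c->parent = p;
	c->parent_slot = p->nr_children;
	p->child[p->nr_children++] = c;
}

/* Usage: duplicate root -> {mid -> {a, b}, c}; returns 1 if the copy matches. */
int toy_selftest(void)
{
	struct toy_node root = {0}, mid = {0}, a = {0}, b = {0}, c = {0}, *copy;
	int ok;

	toy_link(&root, &mid, 10);
	toy_link(&mid, &a, 1);
	toy_link(&mid, &b, 2);
	toy_link(&root, &c, 20);
	copy = toy_dup(&root);
	ok = copy && copy != &root && copy->nr_children == 2 &&
	     copy->child[0] != &mid && copy->child[0]->nr_children == 2 &&
	     copy->child[0]->child[1]->pivot[0] == 2 &&
	     copy->child[1]->pivot[0] == 20 && !copy->child[1]->nr_children;
	toy_free(copy);
	return ok;
}
```

In this toy version the walk needs no recursion and no explicit stack,
because each node records its parent and its slot in that parent. No
rebalancing is involved, since the new tree takes its shape from the
source tree.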
This idea originally comes from Liam R. Howlett's Maple Tree Work email, and I
added some of my own ideas to implement it. Some previous discussions can be
found in [1]. For a more detailed analysis of the algorithm, please refer to the
commit logs of patches [3/10] and [10/10].
byte-unixbench[2] includes a "spawn" test, which can be used to measure the
performance of fork(). I modified it slightly to make it work with
different numbers of VMAs.
Below are the test results. The first row shows the number of VMAs.
The second and third rows show the number of fork() calls per ten seconds
on next-20231006 and with this patchset, respectively. The fourth row shows
the relative improvement. The tests were run with CPU binding to avoid
scheduler load balancing that could cause unstable results. There are still
some fluctuations in the results, but the patchset is faster than the
baseline for every VMA count tested.
VMAs            21      121     221     421     821     1621    3221    6421    12821   25621   51221
next-20231006   112100  76261   54227   34035   20195   11112   6017    3161    1606    802     393
this patchset   114558  83067   65008   45824   28751   16072   8922    4747    2436    1233    599
improvement     2.19%   8.92%   19.88%  34.64%  42.37%  44.64%  48.28%  50.17%  51.68%  53.74%  52.42%
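
The percentages in the results above appear to be the relative gain
(new - old) / old, expressed in percent. The short C sketch below recomputes
them from the two rows of fork() counts; the function and array names are
made up for this illustration and are not part of the benchmark.

```c
#include <assert.h>
#include <stdio.h>

/* fork() calls per ten seconds, copied from the results above. */
static const unsigned long baseline[] = {
	112100, 76261, 54227, 34035, 20195, 11112, 6017, 3161, 1606, 802, 393
};
static const unsigned long patched[] = {
	114558, 83067, 65008, 45824, 28751, 16072, 8922, 4747, 2436, 1233, 599
};

/* Relative gain of new_count over old_count, in percent. */
double improvement_pct(unsigned long old_count, unsigned long new_count)
{
	assert(old_count > 0);
	return ((double)new_count - (double)old_count) * 100.0 /
	       (double)old_count;
}

/* Print one percentage per VMA count, in the same order as the results. */
void print_improvements(void)
{
	for (size_t i = 0; i < sizeof(baseline) / sizeof(baseline[0]); i++)
		printf("%.2f%%\n", improvement_pct(baseline[i], patched[i]));
}
```

For the first column, for example, (114558 - 112100) / 112100 is roughly
0.0219, which matches the reported 2.19%.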
Thanks to Liam and Matthew for the review.
Changes since v5:
- Correct the copyright statement.
- Add Suggested-by tags to patches [3/10] and [10/10]; this work was originally
proposed by Liam R. Howlett.
- Some cleanup and comment corrections for patch [3/10].
- Use the vma_iter* family of interfaces as much as possible in [10/10].
[1] https://lore.kernel.org/lkml/463899aa-6cbd-f08e-0aca-077b0e4e4475@bytedance.com/
[2] https://github.com/kdlucas/byte-unixbench/tree/master
v1: https://lore.kernel.org/lkml/20230726080916.17454-1-zhangpeng.00@bytedance.com/
v2: https://lore.kernel.org/lkml/20230830125654.21257-1-zhangpeng.00@bytedance.com/
v3: https://lore.kernel.org/lkml/20230925035617.84767-1-zhangpeng.00@bytedance.com/
v4: https://lore.kernel.org/lkml/20231009090320.64565-1-zhangpeng.00@bytedance.com/
v5: https://lore.kernel.org/lkml/20231016032226.59199-1-zhangpeng.00@bytedance.com/
Peng Zhang (10):
maple_tree: Add mt_free_one() and mt_attr() helpers
maple_tree: Introduce {mtree,mas}_lock_nested()
maple_tree: Introduce interfaces __mt_dup() and mtree_dup()
radix tree test suite: Align kmem_cache_alloc_bulk() with kernel
behavior.
maple_tree: Add test for mtree_dup()
maple_tree: Update the documentation of maple tree
maple_tree: Skip other tests when BENCH is enabled
maple_tree: Update check_forking() and bench_forking()
maple_tree: Preserve the tree attributes when destroying maple tree
fork: Use __mt_dup() to duplicate maple tree in dup_mmap()
Documentation/core-api/maple_tree.rst | 4 +
include/linux/maple_tree.h | 7 +
include/linux/mm.h | 11 +
kernel/fork.c | 40 ++-
lib/maple_tree.c | 290 +++++++++++++++++++-
lib/test_maple_tree.c | 123 +++++----
mm/internal.h | 11 -
mm/memory.c | 7 +-
mm/mmap.c | 9 +-
tools/include/linux/rwsem.h | 4 +
tools/include/linux/spinlock.h | 1 +
tools/testing/radix-tree/linux.c | 45 +++-
tools/testing/radix-tree/maple.c | 363 ++++++++++++++++++++++++++
13 files changed, 813 insertions(+), 102 deletions(-)
--
2.20.1