lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <74e8665a-3818-4e6c-83d1-9a0220a79e49@oracle.com>
Date: Mon, 18 Nov 2024 11:36:18 -0500
From: Sidhartha Kumar <sidhartha.kumar@...cle.com>
To: Wei Yang <richard.weiyang@...il.com>
Cc: linux-kernel@...r.kernel.org, maple-tree@...ts.infradead.org,
        linux-mm@...ck.org, akpm@...ux-foundation.org, liam.howlett@...cle.com
Subject: Re: [PATCH 3/5] maple_tree: use vacant nodes to reduce worst case
 allocations

On 11/15/24 8:41 PM, Wei Yang wrote:
> On Fri, Nov 15, 2024 at 03:34:55PM -0500, Sidhartha Kumar wrote:
>> On 11/15/24 2:52 AM, Wei Yang wrote:
>>> On Thu, Nov 14, 2024 at 12:05:22PM -0500, Sidhartha Kumar wrote:
>>>> In order to determine the store type for a maple tree operation, a walk
>>>> of the tree is done through mas_wr_walk(). This function descends the
>>>> tree until a spanning write is detected or we reach a leaf node. While
>>>> descending, keep track of the height at which we encounter a node with
>>>> available space. This is done by checking if mas->end is less than the
>>>> number of slots a given node type can fit.
>>>>
>>>> Now that the height of the vacant node is tracked, we can use the
>>>> difference between the height of the tree and the height of the vacant
>>>> node to know how many levels we will have to propagate creating new
>>>> nodes. Update mas_prealloc_calc() to consider the vacant height and
>>>> reduce the number of worst allocations.
>>>>
>>>> Rebalancing stores are not supported and fall back to using the full
>>>> height of the tree for allocations.
>>>>
>>>> Update preallocation testing assertions to take into account vacant
>>>> height.
>>>>
>>>> Signed-off-by: Sidhartha <sidhartha.kumar@...cle.com>
>>>> ---
>>>> include/linux/maple_tree.h       |  2 +
>>>> lib/maple_tree.c                 | 13 +++--
>>>> tools/testing/radix-tree/maple.c | 97 +++++++++++++++++++++++++++++---
>>>> 3 files changed, 100 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
>>>> index cbbcd18d4186..7d777aa2d9ed 100644
>>>> --- a/include/linux/maple_tree.h
>>>> +++ b/include/linux/maple_tree.h
>>>> @@ -463,6 +463,7 @@ struct ma_wr_state {
>>>> 	void __rcu **slots;		/* mas->node->slots pointer */
>>>> 	void *entry;			/* The entry to write */
>>>> 	void *content;			/* The existing entry that is being overwritten */
>>>> +	unsigned char vacant_height;	/* Depth of lowest node with free space */
>>>                                ^^^           ^^^
>>>
>>> Would this be a little misleading?
>>>
>>
>> Could you elaborate on how its misleading?
>>
> 
> As you mentioned in previous patch, depth and height has different meaning.
> 
> Root node has depth of 0 and height of 1. So I may wandering whether this is
> depth or height.
> 
>>>> };
>>>>
>>>> #define mas_lock(mas)           spin_lock(&((mas)->tree->ma_lock))
>>>> @@ -498,6 +499,7 @@ struct ma_wr_state {
>>>> 		.mas = ma_state,					\
>>>> 		.content = NULL,					\
>>>> 		.entry = wr_entry,					\
>>>> +		.vacant_height = 0					\
>>>> 	}
>>>>
>>>> #define MA_TOPIARY(name, tree)						\
>>>> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
>>>> index 21289e350382..f14d70c171c2 100644
>>>> --- a/lib/maple_tree.c
>>>> +++ b/lib/maple_tree.c
>>>> @@ -3545,6 +3545,9 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas)
>>>> 		if (ma_is_leaf(wr_mas->type))
>>>> 			return true;
>>>>
>>>> +		if (mas->end < mt_slots[wr_mas->type] - 1)
>>>> +			wr_mas->vacant_height = mas->depth + 1;
>>>
>>> For some cases in rebalance, we may split data into three parts, which means
>>> we need 2 extra vacant slot.
>>>
>>> Maybe this check is not accurate?
>>>
>>
>> The triple split scenario which you are describing comes from the spanning
>> store case not on the wr_rebalance case. There is a check before we set
>> vacant height to return if is_span_wr() so I believe this is correct still.
>>
> 
> Hmm... I come up with a case in which vacant_height may not high enough.
> 
> Here is the subtree where spanning write locates. The first level is the
> parent node of the second level nodes.
> 
>                  vacant node
>                  +--------+-+-+-------+
>                  |        |l|r|       |
>                  +--------+-+-+-------+
> 
>                  l                 r
>      +------+    +----+-------+    +---------+--+    +------+
>      |      |    |    |       |    |         |  |    |      |
>      +------+    +----+-------+    +---------+--+    +------+
>                       ^                      ^
> 		     |                      |
> 		   index                   last
> 
> When mas_wr_walk_descend() to node l, mas_is_span_wr() return true since last
> is in the right sibling, node r. Let's say the parent is vacant and l/r is
> leaf. So the request number is (1 * 3) + 1.
> 
> Let's assume:
> 
>    * vacant node is just sufficient
>    * l's left part + r's right part is sufficient but not overflow
> 
> Then the "new vacant node" would be deficient and needs another round
> rebalance.
> 
> For this case, I am afraid it doesn't allocate enough nodes. Or I
> misunderstand this?

I think you are correct and we need to use the sufficient height which 
is introduced in patch 5 in the spanning store case similar to how patch 
5 uses it for the rebalance store case.

	case wr_rebalance:
		if (wr_mas->sufficient_height < wr_mas->vacant_height) {
			ret = (height - wr_mas->sufficient_height)*2 +1;
			break;
		}
		ret = delta * 2 + 1;
		break;

> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ