Message-ID: <74565216-4d21-3fdf-71d0-42770c2fa74f@amd.com>
Date: Tue, 29 Nov 2022 14:42:55 +0100
From: Christian König <christian.koenig@....com>
To: "Pan, Xinhui" <Xinhui.Pan@....com>,
"amd-gfx@...ts.freedesktop.org" <amd-gfx@...ts.freedesktop.org>
Cc: "daniel@...ll.ch" <daniel@...ll.ch>,
"matthew.auld@...el.com" <matthew.auld@...el.com>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Paneer Selvam, Arunpravin" <Arunpravin.PaneerSelvam@....com>,
"intel-gfx@...ts.freedesktop.org" <intel-gfx@...ts.freedesktop.org>
Subject: Re: Re: Re: [PATCH v4] drm: Optimise for continuous memory allocation
On 29.11.22 at 14:14, Pan, Xinhui wrote:
> [AMD Official Use Only - General]
>
> comments inline.
>
> ________________________________________
> From: Koenig, Christian <Christian.Koenig@....com>
> Sent: Tuesday, November 29, 2022 20:07
> To: Pan, Xinhui; amd-gfx@...ts.freedesktop.org
> Cc: daniel@...ll.ch; matthew.auld@...el.com; dri-devel@...ts.freedesktop.org; linux-kernel@...r.kernel.org; Paneer Selvam, Arunpravin; intel-gfx@...ts.freedesktop.org
> Subject: Re: Re: [PATCH v4] drm: Optimise for continuous memory allocation
>
> On 29.11.22 at 12:54, Pan, Xinhui wrote:
>>
>> comments inline.
>>
>> ________________________________________
>> From: Koenig, Christian <Christian.Koenig@....com>
>> Sent: Tuesday, November 29, 2022 19:32
>> To: Pan, Xinhui; amd-gfx@...ts.freedesktop.org
>> Cc: daniel@...ll.ch; matthew.auld@...el.com; dri-devel@...ts.freedesktop.org; linux-kernel@...r.kernel.org; Paneer Selvam, Arunpravin; intel-gfx@...ts.freedesktop.org
>> Subject: Re: [PATCH v4] drm: Optimise for continuous memory allocation
>>
>> On 29.11.22 at 11:56, xinhui pan wrote:
>>> Currently drm-buddy does not have full knowledge of continuous memory.
>>>
>>> Let's consider the scenario below.
>>> order 1: L R
>>> order 0: LL LR RL RR
>>> For an order 1 allocation, it can offer L, R or LR+RL.
>>>
>>> For now, we only implement the L or R cases for continuous memory
>>> allocation. So this patch aims to implement the remaining cases.
>>>
>>> Add a new member leaf_link which links all leaf blocks in ascending
>>> order. Now we can find more than 2 sub-order blocks more easily.
>>> Say, order 4 can be combined with the corresponding order 4, 2+2,
>>> 1+2+1, 0+1+2+0 or 0+2+1+0 blocks.
>> Well, that description is a bit confusing and doesn't make too much sense
>> to me.
>>
>> When you have two adjacent free order 0 blocks then those should be
>> automatically combined into an order 1. This is a fundamental property
>> of the buddy allocator, otherwise the whole algorithm won't work.
>>
>> [xh] Sorry, the order above is not 4, it should be 3.
>> Order 3 can be combined with the corresponding order 3, 2+2, 1+2+1, 0+1+2+0 or 0+2+1+0 blocks.
>> The 0+1+2+0 case does not have two adjacent blocks of the same order 0; they are in different trees.
>> It looks like below:
>> order 3: L3 R3
>> order 2: L2 (R2)* L2*
>> order 1: L1 (R1) L1
>> order 0: L0 (R0) (L0)
>> R0 + R1 + R2 + L0, marked with (), combine into order 3.
>> R2 + L2, marked with *, combine into order 3.
>> etc....
>>
>> When you have the case of a free order 1 block with two adjacent free
>> order 0 blocks then we have a fragmented address space. In this case the
>> best approach is to fail the allocation and start to swap things out.
>>
>> [xh] Eviction is expensive.
> No, it isn't. Eviction is part of the algorithm to clean this up.
>
> When we can't find any free room then evicting and moving things back in
> is the best we can do to de-fragment the address space.
>
> This is expected behavior.
>
> [xh] I believe eviction is the best approach to clean up memory.
> But as its cost is not cheap, it should be the final step.
> As long as we can find any room to satisfy the request, there is no need to trigger eviction.
>
> Just a theoretical test:
> two threads run in parallel.
> total memory is 128.
> while true {
> alloc 32
> alloc 32
> free 32
> free 32
> alloc 64
> free 64
> }
>
> When thread 0 wants to alloc 64, the memory layout might be one of the following.
> (32) means allocated, _32_ means free.
> case 1: (32) _32_ _32_ (32)
> case 2: (32) _32_ (32) _32_
> case 3: (32) (32) _64_
> case 4: (32) _32_ _64_
> case 5: _128_
> case 6: (64) _64_
>
> Without this patch, it would trigger eviction in case 1 and case 2.
> With this patch, it would trigger eviction only in case 2.
> Obviously, the two threads consume at most 128 memory in total at any time; there is no overcommit.
> The less eviction, the better.
No, once more: Eviction is part of why this works as it should.
In other words eviction is expected here and de-fragments the address
space into larger blocks.
This patch here breaks the general approach of the buddy allocator and
is a no-go as far as I can see.
If looking at adjacent blocks came without extra cost then we could
consider it, but this approach means extra overhead and complexity.
Regards,
Christian.
>
>
> Regards,
> Christian.
>
>> And if it still fails to find the continuous memory with this approach, then let's evict.
>>
>> So what exactly is the goal here?
>>
>> Regards,
>> Christian.
>>
>>> Signed-off-by: xinhui pan <xinhui.pan@....com>
>>> ---
>>> change from v3:
>>> reworked totally. adding leaf_link.
>>>
>>> change from v2:
>>> search continuous block in nearby root if needed
>>>
>>> change from v1:
>>> implement top-down continuous allocation
>>> ---
>>> drivers/gpu/drm/drm_buddy.c | 108 +++++++++++++++++++++++++++++++++---
>>> include/drm/drm_buddy.h | 1 +
>>> 2 files changed, 102 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
>>> index 11bb59399471..8edafb99b02c 100644
>>> --- a/drivers/gpu/drm/drm_buddy.c
>>> +++ b/drivers/gpu/drm/drm_buddy.c
>>> @@ -80,6 +80,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
>>> {
>>> unsigned int i;
>>> u64 offset;
>>> + LIST_HEAD(leaf);
>>>
>>> if (size < chunk_size)
>>> return -EINVAL;
>>> @@ -136,6 +137,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
>>> goto out_free_roots;
>>>
>>> mark_free(mm, root);
>>> + list_add_tail(&root->leaf_link, &leaf);
>>>
>>> BUG_ON(i > mm->max_order);
>>> BUG_ON(drm_buddy_block_size(mm, root) < chunk_size);
>>> @@ -147,6 +149,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
>>> i++;
>>> } while (size);
>>>
>>> + list_del(&leaf);
>>> return 0;
>>>
>>> out_free_roots:
>>> @@ -205,6 +208,9 @@ static int split_block(struct drm_buddy *mm,
>>> mark_free(mm, block->left);
>>> mark_free(mm, block->right);
>>>
>>> + list_add(&block->right->leaf_link, &block->leaf_link);
>>> + list_add(&block->left->leaf_link, &block->leaf_link);
>>> + list_del(&block->leaf_link);
>>> mark_split(block);
>>>
>>> return 0;
>>> @@ -256,6 +262,9 @@ static void __drm_buddy_free(struct drm_buddy *mm,
>>> break;
>>>
>>> list_del(&buddy->link);
>>> + list_add(&parent->leaf_link, &block->leaf_link);
>>> + list_del(&buddy->leaf_link);
>>> + list_del(&block->leaf_link);
>>>
>>> drm_block_free(mm, block);
>>> drm_block_free(mm, buddy);
>>> @@ -386,6 +395,78 @@ alloc_range_bias(struct drm_buddy *mm,
>>> return ERR_PTR(err);
>>> }
>>>
>>> +static struct drm_buddy_block *
>>> +find_continuous_blocks(struct drm_buddy *mm,
>>> + int order,
>>> + unsigned long flags,
>>> + struct drm_buddy_block **rblock)
>>> +{
>>> + struct list_head *head = &mm->free_list[order];
>>> + struct drm_buddy_block *free_block, *max_block = NULL, *end, *begin;
>>> + u64 pages = BIT(order + 1);
>>> + u64 cur_pages;
>>> +
>>> + list_for_each_entry(free_block, head, link) {
>>> + if (max_block) {
>>> + if (!(flags & DRM_BUDDY_TOPDOWN_ALLOCATION))
>>> + break;
>>> +
>>> + if (drm_buddy_block_offset(free_block) <
>>> + drm_buddy_block_offset(max_block))
>>> + continue;
>>> + }
>>> +
>>> + cur_pages = BIT(order);
>>> + begin = end = free_block;
>>> + while (true) {
>>> + struct drm_buddy_block *prev, *next;
>>> + int prev_order, next_order;
>>> +
>>> + prev = list_prev_entry(begin, leaf_link);
>>> + if (!drm_buddy_block_is_free(prev) ||
>>> + drm_buddy_block_offset(prev) >
>>> + drm_buddy_block_offset(begin)) {
>>> + prev = NULL;
>>> + }
>>> + next = list_next_entry(end, leaf_link);
>>> + if (!drm_buddy_block_is_free(next) ||
>>> + drm_buddy_block_offset(next) <
>>> + drm_buddy_block_offset(end)) {
>>> + next = NULL;
>>> + }
>>> + if (!prev && !next)
>>> + break;
>>> +
>>> + prev_order = prev ? drm_buddy_block_order(prev) : -1;
>>> + next_order = next ? drm_buddy_block_order(next) : -1;
>>> + if (next_order >= prev_order) {
>>> + BUG_ON(drm_buddy_block_offset(end) +
>>> + drm_buddy_block_size(mm, end) !=
>>> + drm_buddy_block_offset(next));
>>> + end = next;
>>> + cur_pages += BIT(drm_buddy_block_order(next));
>>> + }
>>> + if (prev_order >= next_order) {
>>> + BUG_ON(drm_buddy_block_offset(prev) +
>>> + drm_buddy_block_size(mm, prev) !=
>>> + drm_buddy_block_offset(begin));
>>> + begin = prev;
>>> + cur_pages += BIT(drm_buddy_block_order(prev));
>>> + }
>>> + if (pages == cur_pages)
>>> + break;
>>> + BUG_ON(pages < cur_pages);
>>> + }
>>> +
>>> + if (pages > cur_pages)
>>> + continue;
>>> +
>>> + *rblock = end;
>>> + max_block = begin;
>>> + }
>>> + return max_block;
>>> +}
>>> +
>>> static struct drm_buddy_block *
>>> get_maxblock(struct list_head *head)
>>> {
>>> @@ -637,7 +718,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
>>> struct list_head *blocks,
>>> unsigned long flags)
>>> {
>>> - struct drm_buddy_block *block = NULL;
>>> + struct drm_buddy_block *block = NULL, *rblock = NULL;
>>> unsigned int min_order, order;
>>> unsigned long pages;
>>> LIST_HEAD(allocated);
>>> @@ -689,17 +770,30 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
>>> break;
>>>
>>> if (order-- == min_order) {
>>> + if (!(flags & DRM_BUDDY_RANGE_ALLOCATION) &&
>>> + min_order != 0 && pages == BIT(order + 1)) {
>>> + block = find_continuous_blocks(mm,
>>> + order,
>>> + flags,
>>> + &rblock);
>>> + if (block)
>>> + break;
>>> + }
>>> err = -ENOSPC;
>>> goto err_free;
>>> }
>>> } while (1);
>>>
>>> - mark_allocated(block);
>>> - mm->avail -= drm_buddy_block_size(mm, block);
>>> - kmemleak_update_trace(block);
>>> - list_add_tail(&block->link, &allocated);
>>> -
>>> - pages -= BIT(order);
>>> + do {
>>> + mark_allocated(block);
>>> + mm->avail -= drm_buddy_block_size(mm, block);
>>> + kmemleak_update_trace(block);
>>> + list_add_tail(&block->link, &allocated);
>>> + pages -= BIT(drm_buddy_block_order(block));
>>> + if (block == rblock || !rblock)
>>> + break;
>>> + block = list_next_entry(block, leaf_link);
>>> + } while (true);
>>>
>>> if (!pages)
>>> break;
>>> diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
>>> index 572077ff8ae7..c5437bd4f4f3 100644
>>> --- a/include/drm/drm_buddy.h
>>> +++ b/include/drm/drm_buddy.h
>>> @@ -50,6 +50,7 @@ struct drm_buddy_block {
>>> */
>>> struct list_head link;
>>> struct list_head tmp_link;
>>> + struct list_head leaf_link;
>>> };
>>>
>>> /* Order-zero must be at least PAGE_SIZE */