Message-ID: <821c666b-ddf8-8b5c-1e8c-69a06ae1c727@codeaurora.org>
Date: Mon, 11 May 2020 16:44:06 +0530
From: Vijayanand Jitta <vjitta@...eaurora.org>
To: Robin Murphy <robin.murphy@....com>, joro@...tes.org,
iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Cc: vinmenon@...eaurora.org, kernel-team@...roid.com
Subject: Re: [PATCH] iommu/iova: Retry from last rb tree node if iova search fails

On 5/9/2020 12:25 AM, Vijayanand Jitta wrote:
>
>
> On 5/7/2020 6:54 PM, Robin Murphy wrote:
>> On 2020-05-06 9:01 pm, vjitta@...eaurora.org wrote:
>>> From: Vijayanand Jitta <vjitta@...eaurora.org>
>>>
>>> Whenever a new iova alloc request comes in, the iova is always searched
>>> from the cached node and the nodes previous to the cached node. So even
>>> if free iova space is available in the nodes that come after the cached
>>> node, iova allocation can still fail because of this approach.
>>>
>>> Consider the following sequence of iova allocs and frees on
>>> 1GB of iova space:
>>>
>>> 1) alloc - 500MB
>>> 2) alloc - 12MB
>>> 3) alloc - 499MB
>>> 4) free - 12MB which was allocated in step 2
>>> 5) alloc - 13MB
>>>
>>> After the above sequence we have 12MB of free iova space, and the
>>> cached node points to the iova pfn of the last 13MB alloc, which is
>>> the lowest iova pfn of that iova space. Now if we get an alloc request
>>> of 2MB, we search only from the cached node towards lower iova pfns
>>> for free iova, and as there isn't any, the iova alloc fails even
>>> though there is 12MB of free iova space.
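[As an illustration, here is a small userspace sketch of that scenario. It
is a toy model, not the kernel allocator: sizes are in MB, allocations are
carved top-down from a 1024MB space, and the names (allocated[], cached,
find_below()) are invented for this example. It shows the 2MB request
failing when the search only looks below the cached entry, while a search
from the topmost entry finds the 12MB hole.]

/* Toy model of the scenario above; sizes in MB, not kernel code. */
#include <stdio.h>

struct range { long lo, hi; };          /* allocated [lo, hi), ascending */

/* State after step 5: 13MB at the bottom, 499MB, a 12MB hole, 500MB. */
static const struct range allocated[] = {
        { 0, 13 }, { 13, 512 }, { 524, 1024 },
};
static const int cached = 0;            /* the 13MB block from step 5 */

/* Look for 'size' MB of free space strictly below allocated[start].lo. */
static long find_below(int start, long size)
{
        long limit = allocated[start].lo;

        for (int i = start - 1; i >= -1; i--) {
                long floor = (i < 0) ? 0 : allocated[i].hi;

                if (limit - floor >= size)
                        return limit - size;    /* highest fit below limit */
                if (i >= 0)
                        limit = allocated[i].lo;
        }
        return -1;                              /* no fit */
}

int main(void)
{
        int last = sizeof(allocated) / sizeof(allocated[0]) - 1;

        /* Current behaviour: only look below the cached block -> fails. */
        printf("search from cached node: %ld\n", find_below(cached, 2));
        /* Searching from the last (topmost) node finds the 12MB hole. */
        printf("search from last node  : %ld\n", find_below(last, 2));
        return 0;
}

[The first lookup should return -1 while the second returns 522, i.e. the
2MB request fits at [522, 524) inside the 12MB hole.]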
>>
>> Yup, this could definitely do with improving. Unfortunately I think this
>> particular implementation is slightly flawed...
>>
>>> To avoid such iova search failures, retry from the last rb tree node
>>> when the iova search fails; this will search the entire tree and get
>>> an iova if it is available.
>>>
>>> Signed-off-by: Vijayanand Jitta <vjitta@...eaurora.org>
>>> ---
>>> drivers/iommu/iova.c | 11 +++++++++++
>>> 1 file changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>>> index 0e6a953..2985222 100644
>>> --- a/drivers/iommu/iova.c
>>> +++ b/drivers/iommu/iova.c
>>> @@ -186,6 +186,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>>> unsigned long flags;
>>> unsigned long new_pfn;
>>> unsigned long align_mask = ~0UL;
>>> + bool retry = false;
>>> if (size_aligned)
>>> align_mask <<= fls_long(size - 1);
>>> @@ -198,6 +199,8 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>>> curr = __get_cached_rbnode(iovad, limit_pfn);
>>> curr_iova = rb_entry(curr, struct iova, node);
>>> +
>>> +retry_search:
>>> do {
>>> limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
>>> new_pfn = (limit_pfn - size) & align_mask;
>>> @@ -207,6 +210,14 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>>> } while (curr && new_pfn <= curr_iova->pfn_hi);
>>> if (limit_pfn < size || new_pfn < iovad->start_pfn) {
>>> + if (!retry) {
>>> + curr = rb_last(&iovad->rbroot);
>>
>> Why walk when there's an anchor node there already? However...
>>
>>> + curr_iova = rb_entry(curr, struct iova, node);
>>> + limit_pfn = curr_iova->pfn_lo;
>>
>> ...this doesn't look right, as by now we've lost the original limit_pfn
>> supplied by the caller, so are highly likely to allocate beyond the
>> range our caller asked for. In fact AFAICS we'd start allocating from
>> directly below the anchor node, beyond the end of the entire
>> address space.
>>
>> The logic I was imagining we want here was something like the rapidly
>> hacked up (and untested) diff below.
>>
>> Thanks,
>> Robin.
>>
>
> Thanks for your comments. I have gone through the logic below and I see
> an issue with the retry check: there could be a case where alloc_lo is
> set to some pfn other than start_pfn, in which case we don't retry even
> though there could still be iova available. I understand it's a hacked
> up version; I can work on this.
>
> But how about we just store limit_pfn, get the node using that, and
> retry once from that node? It would be similar to my patch, just
> correcting the curr node and limit_pfn update in the retry check. Do you
> see any issue with this approach?
>
>
> Thanks,
> Vijay.
I found one issue with my earlier approach: we would search twice from the
cached node down to start_pfn. This can be avoided if we store the pfn_hi
of the cached node and use it as alloc_lo when we retry. I see the below
diff also does the same. I have posted a v2 version of the patch after
going through the comments and the below diff; can you please review that?
Thanks,
Vijay
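
[For reference, a rough sketch of how the idea described above (remember
where the first pass started, and retry once over the space above it from
the anchor node) might look when folded into __alloc_and_insert_iova_range().
This is reconstructed against the v5.7-era function and is not the actual
v2 patch; high_pfn, low_pfn and retry_pfn are placeholder names, and the
retry floor is taken as pfn_hi + 1 so the second pass never re-covers the
cached entry.]

static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
                unsigned long size, unsigned long limit_pfn,
                struct iova *new, bool size_aligned)
{
        struct rb_node *curr, *prev;
        struct iova *curr_iova;
        unsigned long flags;
        unsigned long new_pfn, retry_pfn;
        unsigned long align_mask = ~0UL;
        unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;

        if (size_aligned)
                align_mask <<= fls_long(size - 1);

        /* Walk the tree backwards */
        spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
        if (limit_pfn <= iovad->dma_32bit_pfn &&
                        size >= iovad->max32_alloc_size)
                goto iova32_full;

        curr = __get_cached_rbnode(iovad, limit_pfn);
        curr_iova = rb_entry(curr, struct iova, node);
        /* Remember where the first pass starts so the retry can stop there. */
        retry_pfn = curr_iova->pfn_hi + 1;

retry:
        do {
                high_pfn = min(high_pfn, curr_iova->pfn_lo);
                new_pfn = (high_pfn - size) & align_mask;
                prev = curr;
                curr = rb_prev(curr);
                curr_iova = rb_entry(curr, struct iova, node);
        } while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);

        if (high_pfn < size || new_pfn < low_pfn) {
                if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
                        /* Second pass: only the space above the cached node. */
                        high_pfn = limit_pfn;
                        low_pfn = retry_pfn;
                        curr = &iovad->anchor.node;
                        curr_iova = rb_entry(curr, struct iova, node);
                        goto retry;
                }
                iovad->max32_alloc_size = size;
                goto iova32_full;
        }

        /* pfn_lo will point to size aligned address if size_aligned is set */
        new->pfn_lo = new_pfn;
        new->pfn_hi = new->pfn_lo + size - 1;

        /* If we have 'prev', it's a valid place to start the insertion. */
        iova_insert_rbtree(&iovad->rbroot, new, prev);
        __cached_rbnode_insert_update(iovad, new);

        spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
        return 0;

iova32_full:
        spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
        return -ENOMEM;
}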
>> ----->8-----
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index 0e6a9536eca6..3574c19272d6 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -186,6 +186,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>> unsigned long flags;
>> unsigned long new_pfn;
>> unsigned long align_mask = ~0UL;
>> + unsigned long alloc_hi, alloc_lo;
>>
>> if (size_aligned)
>> align_mask <<= fls_long(size - 1);
>> @@ -196,17 +197,27 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>> size >= iovad->max32_alloc_size)
>> goto iova32_full;
>>
>> + alloc_hi = IOVA_ANCHOR;
>> + alloc_lo = iovad->start_pfn;
>> +retry:
>> curr = __get_cached_rbnode(iovad, limit_pfn);
>> curr_iova = rb_entry(curr, struct iova, node);
>> + if (alloc_hi < curr_iova->pfn_hi) {
>> + alloc_lo = curr_iova->pfn_hi;
>> + alloc_hi = limit_pfn;
>> + }
>> +
>> do {
>> - limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
>> - new_pfn = (limit_pfn - size) & align_mask;
>> + alloc_hi = min(alloc_hi, curr_iova->pfn_lo);
>> + new_pfn = (alloc_hi - size) & align_mask;
>> prev = curr;
>> curr = rb_prev(curr);
>> curr_iova = rb_entry(curr, struct iova, node);
>> } while (curr && new_pfn <= curr_iova->pfn_hi);
>>
>> - if (limit_pfn < size || new_pfn < iovad->start_pfn) {
>> + if (limit_pfn < size || new_pfn < alloc_lo) {
>> + if (alloc_lo == iovad->start_pfn)
>> + goto retry;
>> iovad->max32_alloc_size = size;
>> goto iova32_full;
>> }