Message-ID: <e4dd0282-9d36-2398-5e8c-2ac5527744a0@linux.intel.com>
Date: Tue, 30 Jul 2019 15:25:42 -0700
From: sathyanarayanan kuppuswamy <sathyanarayanan.kuppuswamy@...ux.intel.com>
To: Dennis Zhou <dennis@...nel.org>
Cc: Dave Hansen <dave.hansen@...el.com>,
Uladzislau Rezki <urezki@...il.com>, akpm@...ux-foundation.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 1/1] mm/vmalloc.c: Fix percpu free VM area search criteria
On 7/30/19 2:55 PM, Dennis Zhou wrote:
> On Tue, Jul 30, 2019 at 02:13:25PM -0700, sathyanarayanan kuppuswamy wrote:
>> On 7/30/19 1:54 PM, Dave Hansen wrote:
>>> On 7/30/19 1:46 PM, Uladzislau Rezki wrote:
>>>>> + /*
>>>>> + * If required width exceeds current VA block, move
>>>>> + * base downwards and then recheck.
>>>>> + */
>>>>> + if (base + end > va->va_end) {
>>>>> + base = pvm_determine_end_from_reverse(&va, align) - end;
>>>>> + term_area = area;
>>>>> + continue;
>>>>> + }
>>>>> +
>>>>> /*
>>>>> * If this VA does not fit, move base downwards and recheck.
>>>>> */
>>>>> - if (base + start < va->va_start || base + end > va->va_end) {
>>>>> + if (base + start < va->va_start) {
>>>>> va = node_to_va(rb_prev(&va->rb_node));
>>>>> base = pvm_determine_end_from_reverse(&va, align) - end;
>>>>> term_area = area;
>>>>> --
>>>>> 2.21.0
>>>>>
>>>> I guess it is a NUMA-related issue, I mean when we have several
>>>> areas/sizes/offsets. Is that correct?
>>> I don't think NUMA has anything to do with it. The vmalloc() area
>>> itself doesn't have any NUMA properties I can think of. We don't, for
>>> instance, partition it into per-node areas that I know of.
>>>
>>> I did encounter this issue on a system with ~100 logical CPUs, which is
>>> a moderate amount these days.
>> I agree with Dave. I don't think this issue is related to NUMA. The problem
>> here is the logic we use to find an appropriate vm_area that satisfies
>> the offset and size requirements of the pcpu memory allocator.
>>
>> In my test case, I can reproduce this issue if we make a request with offset
>> (ffff000000) and size (600000).
>>
>> --
>> Sathyanarayanan Kuppuswamy
>> Linux kernel developer
>>
> I misspoke earlier. I don't think it's NUMA related either, but I think
> you could trigger this much more easily this way as it could skip more
> viable vma space because it'd have to find more holes.
>
> But it seems that pvm_determine_end_from_reverse() will return the free
> vma below the address if it is aligned so:
>
> base + end > va->va_end
>
> will always be true and then push down the searching va instead of using
> that va first.
It won't always be true. Initially, the base address is calculated as follows:
base = pvm_determine_end_from_reverse(&va, align) - end;
So on the first iteration base + end is the aligned end of that va, which is
at most va->va_end, and the check does not trigger.
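
To illustrate the point, here is a rough userspace sketch, not kernel code.
It assumes that, for a single free area, pvm_determine_end_from_reverse()
effectively yields va_end rounded down to the alignment; the real helper
also walks the free list and clamps against the end of the vmalloc range,
which is left out. The names free_area, aligned_end(), exceeds_va_end() and
first_iteration_exceeds() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct vmap_area (assumption). */
struct free_area {
	unsigned long va_start;
	unsigned long va_end;
};

/*
 * Hypothetical model of the value pvm_determine_end_from_reverse() yields
 * for one free area: va_end rounded down to 'align'. The free-list walk
 * and the clamp to the end of the vmalloc range are omitted.
 */
static unsigned long aligned_end(const struct free_area *va,
				 unsigned long align)
{
	assert(align && (align & (align - 1)) == 0);	/* power of two */
	return va->va_end & ~(align - 1);
}

/* The new check from the patch: does the required width exceed this area? */
static bool exceeds_va_end(const struct free_area *va,
			   unsigned long base, unsigned long end)
{
	return base + end > va->va_end;
}

/*
 * First iteration: base is derived from the same area's aligned end, so
 * base + end == aligned_end(va, align) <= va->va_end.
 */
static bool first_iteration_exceeds(const struct free_area *va,
				    unsigned long align, unsigned long end)
{
	unsigned long base = aligned_end(va, align) - end;

	return exceeds_va_end(va, base, end);
}
```

With an area ending at 0x7654 and a 0x1000 alignment, the aligned end is
0x7000, so first_iteration_exceeds() is false for any 'end'. The check can
only become true later, once base has been moved by some other group's
constraints and is no longer tied to this area's end.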
>
> Thanks,
> Dennis
>
--
Sathyanarayanan Kuppuswamy
Linux kernel developer