Message-ID: <689b7c24-623d-c01e-6c0f-ad430f1fa3ae@redhat.com>
Date: Wed, 29 Sep 2021 17:05:08 +0200
From: David Hildenbrand <david@...hat.com>
To: Uladzislau Rezki <urezki@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Ping Fang <pifang@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Roman Gushchin <guro@...com>, Michal Hocko <mhocko@...e.com>,
Oscar Salvador <osalvador@...e.de>,
Linux Memory Management List <linux-mm@...ck.org>
Subject: Re: [PATCH v1] mm/vmalloc: fix exact allocations with an alignment > 1

On 29.09.21 16:49, Uladzislau Rezki wrote:
> On Wed, Sep 29, 2021 at 4:40 PM David Hildenbrand <david@...hat.com> wrote:
>>
>> On 29.09.21 16:30, Uladzislau Rezki wrote:
>>>>
>>>> So the idea is that once we run into a dead end because we took a left
>>>> subtree, we roll back to the next possible right subtree and try again.
>>>> If we run into another dead end, we repeat ... thus, this can now happen
>>>> more than once.
>>>>
>>>> I assume the only implication is that this can now be slower in some
>>>> corner cases with larger alignment, because it might take longer to find
>>>> something suitable. Fair enough.
>>>>
>>> Yep, your understanding is correct regarding the tree traversal. If no
>>> suitable block is found in the left sub-tree, we roll back and check the
>>> right one. So the scanning can happen more than once.
>>>
>>> I did some performance analysis using the vmalloc test suite to figure
>>> out the performance loss for allocations with a specific alignment. On
>>> that synthetic test I see approx. 30% degradation:
>>
>> How realistic is that test case? I assume most alignments we're dealing
>> with are:
>> * 1 or PAGE_SIZE
>> * huge page size (for automatic huge page placing)
>>
> Well, that is a synthetic test. Most of the alignments are 1 or PAGE_SIZE.
> There are users of the internal API, where you can specify the alignment
> you want, but those are mainly KASAN, module alloc, etc.
>
>>>
>>> 2.225 microseconds vs 1.496 microseconds. That time includes both
>>> vmalloc() and vfree() calls. I do not consider it a big degradation,
>>> but on the other hand we can still adjust the search length for
>>> alignments > one page:
>>>
>>> # add it on top of previous proposal and search length instead of size
>>> length = align > PAGE_SIZE ? size + align : size;
>>
>> That will not allow placing huge pages in the case of KASAN. And I
>> consider that more important than optimizing a synthetic test :) My 2 cents.
>>
> Could you please be more specific? I mean, how is it connected with huge
> page mappings? Huge pages are those which have order > 0. Or do you mean
> that special alignments are needed for mapping huge pages?

Let me try to clarify:

KASAN does an exact allocation when onlining a memory block.
__vmalloc_node_range() will try placing huge pages first, increasing the
alignment to e.g., "1 << PMD_SHIFT".

If we increase the search length in find_vmap_lowest_match(), that
search will fail if the exact allocation is surrounded by other
allocations. In that case, we won't place a huge page although we could
-- because find_vmap_lowest_match() would be imprecise for alignments >
PAGE_SIZE.

Memory blocks we online/offline on x86 are at least 128MB. The KASAN
"overhead" we have to allocate is 1/8 of that -- 16MB, so essentially 8
huge pages.
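
To spell out that arithmetic, here is a tiny user-space sketch (not kernel code). The helper names are made up for illustration. The 1/8 ratio and the 2MB huge page size are the values used in this mail; other configurations may differ.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/* Assumption: the KASAN shadow is 1/8 of the memory it covers. */
#define SHADOW_SCALE_SHIFT 3

/* Hypothetical helper: shadow size for a memory block of block_size bytes. */
static uint64_t kasan_shadow_size(uint64_t block_size)
{
	return block_size >> SHADOW_SCALE_SHIFT;
}

/* Hypothetical helper: huge pages needed to back size bytes, rounded up. */
static uint64_t huge_pages_needed(uint64_t size, uint64_t huge_page_size)
{
	assert(huge_page_size != 0);
	return (size + huge_page_size - 1) / huge_page_size;
}
```

With a 128MB memory block, kasan_shadow_size(128 * MIB) gives 16MB, and huge_pages_needed(16 * MIB, 2 * MIB) gives 8.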

__vmalloc_node_range() will increase the alignment to 2MB to try placing
huge pages first. With the proposed search length,
find_vmap_lowest_match() will search within the given exact 16MB area
for an 18MB area (size + align), which won't work. So
__vmalloc_node_range() will fall back to the original PAGE_SIZE alignment
and shift=PAGE_SHIFT.
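
A simplified user-space model of the two checks, just to illustrate the point. This is not the actual find_vmap_lowest_match() code; struct free_area and both helpers are hypothetical, and only the length expression is taken from the proposal quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one free range: [start, end). */
struct free_area {
	uint64_t start;
	uint64_t end;
};

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Precise check: does an aligned block of size bytes fit into the area? */
static bool fits_exact(const struct free_area *fa, uint64_t size,
		       uint64_t align)
{
	uint64_t aligned = align_up(fa->start, align);

	/* aligned < start would mean align_up() wrapped around. */
	return aligned >= fa->start && aligned <= fa->end &&
	       fa->end - aligned >= size;
}

/* Padded check: the area must be at least "length" bytes long. */
static bool passes_padded_search(const struct free_area *fa, uint64_t size,
				 uint64_t align, uint64_t page_size)
{
	/* Length expression as quoted in the proposal. */
	uint64_t length = align > page_size ? size + align : size;

	return fa->end - fa->start >= length;
}
```

For a 2MB-aligned free area of exactly 16MB, fits_exact() with size=16MB and align=2MB succeeds, while passes_padded_search() requires 18MB and rejects the same area. With align=PAGE_SIZE both checks accept it, which is the fallback path described above.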

__vmalloc_area_node() will then effectively set the page order (via
set_vm_area_page_order()) to 0 -- small pages.

Does that make sense or am I missing something?

--
Thanks,
David / dhildenb