Message-ID: <9df905cf-cc4f-c739-26cb-c2e5c6e5a234@redhat.com>
Date: Thu, 22 Apr 2021 20:35:27 +0200
From: David Hildenbrand <david@...hat.com>
To: Florian Fainelli <f.fainelli@...il.com>,
Michal Hocko <mhocko@...e.com>
Cc: Vlastimil Babka <vbabka@...e.cz>, Mel Gorman <mgorman@...e.de>,
Minchan Kim <minchan@...nel.org>,
Johannes Weiner <hannes@...xchg.org>, l.stach@...gutronix.de,
LKML <linux-kernel@...r.kernel.org>,
Jaewon Kim <jaewon31.kim@...sung.com>,
Michal Nazarewicz <mina86@...a86.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Oscar Salvador <OSalvador@...e.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: alloc_contig_range() with MIGRATE_MOVABLE performance regression
since 4.9
On 22.04.21 19:50, Florian Fainelli wrote:
>
>
> On 4/22/2021 1:56 AM, David Hildenbrand wrote:
>> On 22.04.21 09:49, Michal Hocko wrote:
>>> Cc David and Oscar who are familiar with this code as well.
>>>
>>> On Wed 21-04-21 11:36:01, Florian Fainelli wrote:
>>>> Hi all,
>>>>
>>>> I have been trying for the past few days to identify the source of a
>>>> performance regression that we are seeing with the 5.4 kernel but not
>>>> with the 4.9 kernel on ARM64. Testing something newer like 5.10 is a bit
>>>> challenging at the moment but will happen eventually.
>>>>
>>>> What we are seeing is a ~3x increase in the time needed for
>>>> alloc_contig_range() to allocate 1GB in blocks of 2MB pages. The system
>>>> is idle at the time and there are no other contenders for memory other
>>>> than the user-space programs already started (DHCP client, shell, etc.).
>>
>> Hi,
>>
>> If you can easily reproduce it, it might be worth just trying to bisect;
>> that could be faster than manually poking around in the code.
>>
>> Also, it would be worth having a look at the state of upstream Linux.
>> Upstream Linux developers tend to not care about minor performance
>> regressions on oldish kernels.
>
> This is a big pain point here and I cannot agree more, but until we
> bridge that gap, this is not exactly easy to do for me unfortunately and
> neither is bisection :/
>
>>
>> There has been work on improving exactly the situation you are
>> describing -- a "fail fast" / "no retry" mode for alloc_contig_range().
>> Maybe it tackles exactly this issue.
>>
>> https://lkml.kernel.org/r/20210121175502.274391-3-minchan@kernel.org
>>
>> Minchan is already on cc.
>
> This patch does not appear to be helping, in fact, I had locally applied
> this patch from way back when:
>
> https://lkml.org/lkml/2014/5/28/113
>
> which would effectively do this unconditionally. Let me see if I can
> showcase this problem on an x86 virtual machine operating in similar
> conditions to ours.

How exactly are you allocating these 2MiB blocks? Via
CMA->alloc_contig_range() or via alloc_contig_range() directly? I
assume via CMA.

For

https://lkml.kernel.org/r/20210121175502.274391-3-minchan@kernel.org

to do its work, you'll have to pass __GFP_NORETRY to
alloc_contig_range(). This requires CMA adaptations at the call site
where CMA invokes alloc_contig_range().
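
To make the plumbing concrete, here is a rough userspace toy model of the
idea, not kernel code. Every toy_* name, the flag values, and the retry
budget are made up for illustration. The only point is that the fail-fast
behaviour kicks in when the CMA-side caller adds the flag to the gfp mask
it hands down. The 512-page count assumes 4 KiB pages (512 x 4 KiB = 2 MiB).

```c
#include <assert.h>

/* Toy stand-ins for illustration only; NOT the kernel's definitions. */
typedef unsigned int toy_gfp_t;
#define TOY_GFP_KERNEL  0x1u
#define TOY_GFP_NORETRY 0x2u  /* plays the role of __GFP_NORETRY */
#define TOY_MAX_RETRIES 5     /* made-up retry budget */

static int toy_migrate_calls; /* counts migration attempts */

/* Pretend one page in the range is pinned, so migration always fails. */
static int toy_migrate_range(void)
{
	toy_migrate_calls++;
	return -1;
}

/* Model of the callee: retries unless the caller asked to fail fast. */
static int toy_alloc_contig_range(unsigned long start, unsigned long end,
				  toy_gfp_t gfp_mask)
{
	int tries = (gfp_mask & TOY_GFP_NORETRY) ? 1 : TOY_MAX_RETRIES;
	int i;

	assert(end > start);
	for (i = 0; i < tries; i++)
		if (toy_migrate_range() == 0)
			return 0;
	return -1; /* range stayed busy */
}

/* Model of the CMA call site: this is where the flag has to be added. */
static int toy_cma_alloc(unsigned long pfn, unsigned long count, int fail_fast)
{
	toy_gfp_t gfp = TOY_GFP_KERNEL | (fail_fast ? TOY_GFP_NORETRY : 0);

	return toy_alloc_contig_range(pfn, pfn + count, gfp);
}

/* Returns how many migration attempts one failing allocation costs. */
int toy_attempts_for(int fail_fast)
{
	toy_migrate_calls = 0;
	(void)toy_cma_alloc(0, 512, fail_fast);
	return toy_migrate_calls;
}
```

In this model, toy_attempts_for(0) burns the whole retry budget on a busy
range, while toy_attempts_for(1) gives up after a single attempt.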
--
Thanks,
David / dhildenb