lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e5282de6-4a44-45bf-95ed-7453da21191a@linux.alibaba.com>
Date: Sun, 7 Apr 2024 14:58:11 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
 Vlastimil Babka <vbabka@...e.cz>, Mel Gorman <mgorman@...hsingularity.net>,
 Zi Yan <ziy@...dia.com>, "Huang, Ying" <ying.huang@...el.com>,
 David Hildenbrand <david@...hat.com>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 06/10] mm: page_alloc: fix freelist movement during block
 conversion



On 2024/4/6 00:56, Johannes Weiner wrote:
> Hi Baolin,
> 
> On Fri, Apr 05, 2024 at 08:11:47PM +0800, Baolin Wang wrote:
>> On 2024/3/21 02:02, Johannes Weiner wrote:
>>> @@ -2127,15 +2165,14 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
>>>    				return page;
>>>    		}
>>>    	}
>>> -retry:
>>> +
>>>    	page = __rmqueue_smallest(zone, order, migratetype);
>>>    	if (unlikely(!page)) {
>>>    		if (alloc_flags & ALLOC_CMA)
>>>    			page = __rmqueue_cma_fallback(zone, order);
>>> -
>>> -		if (!page && __rmqueue_fallback(zone, order, migratetype,
>>> -								alloc_flags))
>>> -			goto retry;
>>> +		else
>>> +			page = __rmqueue_fallback(zone, order, migratetype,
>>> +						  alloc_flags);
>>
>> (Sorry for chim in late.)
>>
>> I have some doubts about the changes here. The original code logic was
>> that if the 'migratetype' type allocation is failed, it would first try
>> CMA page allocation and then attempt to fallback to other migratetype
>> allocations. Now it has been changed so that if CMA allocation fails, it
>> will directly return. This change has caused a regression when I running
>> the thpcompact benchmark, resulting in a significant reduction in the
>> percentage of THPs like below:
>>
>> thpcompact Percentage Faults Huge
>>                            K6.9-rc2                K6.9-rc2 + this patch
>> Percentage huge-1        78.18 (   0.00%)       42.49 ( -45.65%)
>> Percentage huge-3        86.70 (   0.00%)       35.13 ( -59.49%)
>> Percentage huge-5        90.26 (   0.00%)       52.35 ( -42.00%)
>> Percentage huge-7        92.34 (   0.00%)       31.84 ( -65.52%)
>> Percentage huge-12       91.18 (   0.00%)       45.85 ( -49.72%)
>> Percentage huge-18       89.00 (   0.00%)       29.18 ( -67.22%)
>> Percentage huge-24       90.52 (   0.00%)       46.68 ( -48.43%)
>> Percentage huge-30       94.44 (   0.00%)       38.35 ( -59.39%)
>> Percentage huge-32       93.09 (   0.00%)       39.37 ( -57.70%)
> 
> Ouch, sorry about that! I changed that specific part around later
> during development and didn't retest with CMA. I'll be sure to
> re-enable it again in my config.
> 
>> After making the following modifications, the regression is gone.
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index ce67dc6777fa..a7cfe65e45c1 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2139,7 +2139,8 @@ __rmqueue(struct zone *zone, unsigned int order,
>> int migratetype,
>>           if (unlikely(!page)) {
>>                   if (alloc_flags & ALLOC_CMA)
>>                           page = __rmqueue_cma_fallback(zone, order);
>> -               else
>> +
>> +               if (!page)
>>                           page = __rmqueue_fallback(zone, order, migratetype,
>>                                                     alloc_flags);
>>           }
>>
>> But I am not sure your original change is intentional? IIUC, we still
>> need try fallbacking even though CMA allocation is failed, please
>> correct me if I misunderstand your code. Thanks.
> 
> No, this was accidental. I missed that CMA dependency when changing
> things around for the new return type of __rmqueue_fallback(). Your
> fix is good: just because the request qualifies for CMA doesn't mean
> it will succeed from that region. We need the fallback for those.

OK. Thanks for the clarification.

> Andrew, could you please pick up Baolin's change for this patch?
> 
> [baolin.wang@...ux.alibaba.com: fix allocation failures with CONFIG_CMA]
> 
> Thanks for debugging this and the fix, Baolin.

My pleasure.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ