Message-ID: <8de812c5-de4b-4d7e-b24a-5674e9fba33d@suse.cz>
Date: Tue, 9 Apr 2024 08:21:53 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Johannes Weiner <hannes@...xchg.org>,
 Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
 Mel Gorman <mgorman@...hsingularity.net>, Zi Yan <ziy@...dia.com>,
 "Huang, Ying" <ying.huang@...el.com>, David Hildenbrand <david@...hat.com>,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 06/10] mm: page_alloc: fix freelist movement during block
 conversion

On 4/5/24 6:56 PM, Johannes Weiner wrote:
> Hi Baolin,
> 
> On Fri, Apr 05, 2024 at 08:11:47PM +0800, Baolin Wang wrote:
>> On 2024/3/21 02:02, Johannes Weiner wrote:
>> > @@ -2127,15 +2165,14 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
>> >   				return page;
>> >   		}
>> >   	}
>> > -retry:
>> > +
>> >   	page = __rmqueue_smallest(zone, order, migratetype);
>> >   	if (unlikely(!page)) {
>> >   		if (alloc_flags & ALLOC_CMA)
>> >   			page = __rmqueue_cma_fallback(zone, order);
>> > -
>> > -		if (!page && __rmqueue_fallback(zone, order, migratetype,
>> > -								alloc_flags))
>> > -			goto retry;
>> > +		else
>> > +			page = __rmqueue_fallback(zone, order, migratetype,
>> > +						  alloc_flags);
>> 
>> (Sorry for chiming in late.)
>> 
>> I have some doubts about the changes here. The original logic was
>> that if allocation from the requested 'migratetype' failed, it would
>> first try CMA page allocation and then fall back to other migratetype
>> allocations. Now it has been changed so that if CMA allocation fails,
>> it returns directly without trying the fallback. This change caused a
>> regression when I ran the thpcompact benchmark, resulting in a
>> significant reduction in the percentage of THPs, as shown below:
>> 
>> thpcompact Percentage Faults Huge
>>                           K6.9-rc2                K6.9-rc2 + this patch
>> Percentage huge-1        78.18 (   0.00%)       42.49 ( -45.65%)
>> Percentage huge-3        86.70 (   0.00%)       35.13 ( -59.49%)
>> Percentage huge-5        90.26 (   0.00%)       52.35 ( -42.00%)
>> Percentage huge-7        92.34 (   0.00%)       31.84 ( -65.52%)
>> Percentage huge-12       91.18 (   0.00%)       45.85 ( -49.72%)
>> Percentage huge-18       89.00 (   0.00%)       29.18 ( -67.22%)
>> Percentage huge-24       90.52 (   0.00%)       46.68 ( -48.43%)
>> Percentage huge-30       94.44 (   0.00%)       38.35 ( -59.39%)
>> Percentage huge-32       93.09 (   0.00%)       39.37 ( -57.70%)
> 
> Ouch, sorry about that! I changed that specific part around later
> during development and didn't retest with CMA. I'll be sure to
> re-enable it in my config.
> 
>> After making the following modifications, the regression is gone.
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index ce67dc6777fa..a7cfe65e45c1 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2139,7 +2139,8 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
>>          if (unlikely(!page)) {
>>                  if (alloc_flags & ALLOC_CMA)
>>                          page = __rmqueue_cma_fallback(zone, order);
>> -               else
>> +
>> +               if (!page)
>>                          page = __rmqueue_fallback(zone, order, migratetype,
>>                                                    alloc_flags);
>>          }
>> 
>> But I am not sure whether your original change was intentional. IIUC, we
>> still need to try the fallback even when CMA allocation fails; please
>> correct me if I misunderstand your code. Thanks.
> 
> No, this was accidental. I missed that CMA dependency when changing
> things around for the new return type of __rmqueue_fallback(). Your
> fix is good: just because the request qualifies for CMA doesn't mean
> it will succeed from that region. We need the fallback for those.
> 
> Andrew, could you please pick up Baolin's change for this patch?
> 
> [baolin.wang@...ux.alibaba.com: fix allocation failures with CONFIG_CMA]

Acked-by: Vlastimil Babka <vbabka@...e.cz>

> Thanks for debugging this and the fix, Baolin.
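
For anyone following the thread, the control-flow difference discussed above
can be sketched in standalone userspace C. This is only a toy model under
stated assumptions, not the kernel code: struct toy_zone, struct toy_page,
TOY_ALLOC_CMA and all toy_* helpers are hypothetical names that merely mimic
whether __rmqueue_smallest(), __rmqueue_cma_fallback() and
__rmqueue_fallback() would find a page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT kernel types: a "page" is just a tag naming its source. */
struct toy_page { const char *source; };

/* Hypothetical zone model: each flag says whether that path has a free page. */
struct toy_zone {
	bool has_requested_type; /* models __rmqueue_smallest() succeeding */
	bool has_cma;            /* models __rmqueue_cma_fallback() succeeding */
	bool has_other_type;     /* models __rmqueue_fallback() succeeding */
};

#define TOY_ALLOC_CMA 0x1u /* illustrative value, not the kernel's ALLOC_CMA */

static struct toy_page from_requested = { "requested" };
static struct toy_page from_cma = { "cma" };
static struct toy_page from_other = { "fallback" };

static struct toy_page *toy_smallest(const struct toy_zone *z)
{
	return z->has_requested_type ? &from_requested : NULL;
}

static struct toy_page *toy_cma_fallback(const struct toy_zone *z)
{
	return z->has_cma ? &from_cma : NULL;
}

static struct toy_page *toy_fallback(const struct toy_zone *z)
{
	return z->has_other_type ? &from_other : NULL;
}

/* Flow as posted: the "else" skips the migratetype fallback for CMA requests. */
struct toy_page *toy_rmqueue_as_posted(const struct toy_zone *z,
				       unsigned int flags)
{
	struct toy_page *page = toy_smallest(z);

	if (!page) {
		if (flags & TOY_ALLOC_CMA)
			page = toy_cma_fallback(z);
		else
			page = toy_fallback(z);
	}
	return page;
}

/* Flow with the fix: try the migratetype fallback whenever page is still NULL. */
struct toy_page *toy_rmqueue_fixed(const struct toy_zone *z,
				   unsigned int flags)
{
	struct toy_page *page = toy_smallest(z);

	if (!page) {
		if (flags & TOY_ALLOC_CMA)
			page = toy_cma_fallback(z);

		if (!page)
			page = toy_fallback(z);
	}
	return page;
}

/* CMA-eligible request, empty CMA region, other migratetypes still available. */
void toy_selfcheck(void)
{
	const struct toy_zone z = {
		.has_requested_type = false,
		.has_cma = false,
		.has_other_type = true,
	};

	assert(toy_rmqueue_as_posted(&z, TOY_ALLOC_CMA) == NULL);
	assert(toy_rmqueue_fixed(&z, TOY_ALLOC_CMA) == &from_other);
}
```

Calling toy_selfcheck() exercises the case described in the thread: the
request qualifies for CMA, the CMA path has nothing to give, and only the
fixed flow goes on to take a page from another migratetype.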

