lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 5 Apr 2024 12:56:32 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Vlastimil Babka <vbabka@...e.cz>,
	Mel Gorman <mgorman@...hsingularity.net>, Zi Yan <ziy@...dia.com>,
	"Huang, Ying" <ying.huang@...el.com>,
	David Hildenbrand <david@...hat.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 06/10] mm: page_alloc: fix freelist movement during block
 conversion

Hi Baolin,

On Fri, Apr 05, 2024 at 08:11:47PM +0800, Baolin Wang wrote:
> On 2024/3/21 02:02, Johannes Weiner wrote:
> > @@ -2127,15 +2165,14 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
> >   				return page;
> >   		}
> >   	}
> > -retry:
> > +
> >   	page = __rmqueue_smallest(zone, order, migratetype);
> >   	if (unlikely(!page)) {
> >   		if (alloc_flags & ALLOC_CMA)
> >   			page = __rmqueue_cma_fallback(zone, order);
> > -
> > -		if (!page && __rmqueue_fallback(zone, order, migratetype,
> > -								alloc_flags))
> > -			goto retry;
> > +		else
> > +			page = __rmqueue_fallback(zone, order, migratetype,
> > +						  alloc_flags);
> 
> (Sorry for chim in late.)
> 
> I have some doubts about the changes here. The original code logic was 
> that if the 'migratetype' type allocation is failed, it would first try 
> CMA page allocation and then attempt to fallback to other migratetype 
> allocations. Now it has been changed so that if CMA allocation fails, it 
> will directly return. This change has caused a regression when I running 
> the thpcompact benchmark, resulting in a significant reduction in the 
> percentage of THPs like below:
> 
> thpcompact Percentage Faults Huge
>                           K6.9-rc2                K6.9-rc2 + this patch
> Percentage huge-1        78.18 (   0.00%)       42.49 ( -45.65%)
> Percentage huge-3        86.70 (   0.00%)       35.13 ( -59.49%)
> Percentage huge-5        90.26 (   0.00%)       52.35 ( -42.00%)
> Percentage huge-7        92.34 (   0.00%)       31.84 ( -65.52%)
> Percentage huge-12       91.18 (   0.00%)       45.85 ( -49.72%)
> Percentage huge-18       89.00 (   0.00%)       29.18 ( -67.22%)
> Percentage huge-24       90.52 (   0.00%)       46.68 ( -48.43%)
> Percentage huge-30       94.44 (   0.00%)       38.35 ( -59.39%)
> Percentage huge-32       93.09 (   0.00%)       39.37 ( -57.70%)

Ouch, sorry about that! I changed that specific part around later
during development and didn't retest with CMA. I'll be sure to
re-enable it again in my config.

> After making the following modifications, the regression is gone.
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ce67dc6777fa..a7cfe65e45c1 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2139,7 +2139,8 @@ __rmqueue(struct zone *zone, unsigned int order, 
> int migratetype,
>          if (unlikely(!page)) {
>                  if (alloc_flags & ALLOC_CMA)
>                          page = __rmqueue_cma_fallback(zone, order);
> -               else
> +
> +               if (!page)
>                          page = __rmqueue_fallback(zone, order, migratetype,
>                                                    alloc_flags);
>          }
> 
> But I am not sure your original change is intentional? IIUC, we still 
> need try fallbacking even though CMA allocation is failed, please 
> correct me if I misunderstand your code. Thanks.

No, this was accidental. I missed that CMA dependency when changing
things around for the new return type of __rmqueue_fallback(). Your
fix is good: just because the request qualifies for CMA doesn't mean
it will succeed from that region. We need the fallback for those.

Andrew, could you please pick up Baolin's change for this patch?

[baolin.wang@...ux.alibaba.com: fix allocation failures with CONFIG_CMA]

Thanks for debugging this and the fix, Baolin.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ