Date:   Tue, 31 May 2022 07:52:14 +0530
From:   Anshuman Khandual <anshuman.khandual@....com>
To:     Zi Yan <ziy@...dia.com>
Cc:     linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC] mm/page_isolation: Fix an infinite loop in
 isolate_single_pageblock()



On 5/30/22 19:23, Zi Yan wrote:
> On 30 May 2022, at 7:50, Anshuman Khandual wrote:
> 
>> HugeTLB allocation (32MB pages with a 4K base page size) via sysfs on an arm64
>> platform gets stuck in isolate_single_pageblock() because of an infinite loop.
>> Because head_pfn always evaluates to the same value, so does pfn, and the outer
>> loop never exits. Dropping the relevant code block, which seems redundant, makes
>> the problem go away.
> 
> Thanks for the report.
> 
>>
>> Cc: Andrew Morton <akpm@...ux-foundation.org>
>> Cc: Zi Yan <ziy@...dia.com>
>> Cc: linux-mm@...ck.org
>> Cc: linux-kernel@...r.kernel.org
>> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@....com>
>> ---
>> I am not sure about this fix, and did not find much time today to
>> debug any further. There have been many code changes around this function
>> recently. This problem is present on the latest mainline kernel.
>>
>> - Anshuman
>>
>>  mm/page_isolation.c | 4 ----
>>  1 file changed, 4 deletions(-)
>>
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index 6021f8444b5a..b0922fee75c1 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -389,10 +389,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>  			struct page *head = compound_head(page);
>>  			unsigned long head_pfn = page_to_pfn(head);
>>
>> -			if (head_pfn + nr_pages <= boundary_pfn) {
>> -				pfn = head_pfn + nr_pages;
>> -				continue;
>> -			}
>>  #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>>  			/*
>>  			 * hugetlb, lru compound (THP), and movable compound pages
>> -- 
>> 2.20.1
> 
> Can you try the patch below to see if it fixes the issue? Thanks.
> 
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 6021f8444b5a..d200d41ad0d3 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -385,9 +385,9 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>                  * above do the rest. If migration is not possible, just fail.
>                  */
>                 if (PageCompound(page)) {
> -                       unsigned long nr_pages = compound_nr(page);
>                         struct page *head = compound_head(page);
>                         unsigned long head_pfn = page_to_pfn(head);
> +                       unsigned long nr_pages = compound_nr(head);
> 
>                         if (head_pfn + nr_pages <= boundary_pfn) {
>                                 pfn = head_pfn + nr_pages;
> 
> 

Yes, this does solve the problem. I guess nr_pages should have been derived
from the compound head itself for it to be meaningful (i.e. > 1). I assume you
will send a fix patch with an appropriate write-up that describes this problem.
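
To illustrate why the size has to come from the head, here is a minimal
userspace sketch. It is not kernel code: struct toy_page, toy_mem and the
toy_*() helpers are made-up names for this illustration only, and it assumes
that compound_nr() reports 1 when handed a tail page, which is what the
observed behaviour suggests. The boundary_pfn check is left out, so the
sketch only shows which pfn the loop would pick next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 16UL

/* Toy stand-in for struct page: only what this illustration needs. */
struct toy_page {
	size_t head_pfn;	/* pfn of the compound head (itself for a head) */
	unsigned long order;	/* compound order, set on the head page only */
};

static struct toy_page toy_mem[TOY_NPAGES];

/* Build one compound page of (1 << order) pages starting at head_pfn. */
static void toy_make_compound(unsigned long head_pfn, unsigned long order)
{
	unsigned long nr = 1UL << order;
	unsigned long i;

	assert(head_pfn + nr <= TOY_NPAGES);
	for (i = 0; i < nr; i++) {
		toy_mem[head_pfn + i].head_pfn = head_pfn;
		/* Tail pages carry no size information (assumption). */
		toy_mem[head_pfn + i].order = (i == 0) ? order : 0;
	}
}

/* Assumed behaviour: a tail page reports 1, only the head knows the size. */
static unsigned long toy_compound_nr(unsigned long pfn)
{
	return 1UL << toy_mem[pfn].order;
}

static unsigned long toy_compound_head(unsigned long pfn)
{
	return toy_mem[pfn].head_pfn;
}

/*
 * Next pfn the skip logic would select for a compound page.
 * size_from_head == false models nr_pages taken from the page itself,
 * size_from_head == true models nr_pages taken from the compound head.
 */
static unsigned long toy_next_pfn(unsigned long pfn, bool size_from_head)
{
	unsigned long head_pfn = toy_compound_head(pfn);
	unsigned long nr_pages =
		toy_compound_nr(size_from_head ? head_pfn : pfn);

	return head_pfn + nr_pages;
}
```

With an 8-page compound page at pfn 0, calling toy_next_pfn() on a tail pfn
with size_from_head == false lands on head_pfn + 1, which is itself a tail
page, so the next iteration computes the same pfn again and never moves on.
With size_from_head == true the result is head_pfn + 8, i.e. the first pfn
past the compound page.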

- Anshuman
