linux-kernel - Re: [PATCH v2 5/9] mm/hugetlb: convert isolate_or_dissolve_huge

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <64265cd2-96b2-2c91-83e5-494187821492@oracle.com>
Date:   Mon, 12 Jun 2023 11:45:05 -0700
From:   Sidhartha Kumar <sidhartha.kumar@...cle.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        akpm@...ux-foundation.org, songmuchun@...edance.com,
        mike.kravetz@...cle.com, almasrymina@...gle.com,
        linmiaohe@...wei.com, minhquangbui99@...il.com,
        aneesh.kumar@...ux.ibm.com
Subject: Re: [PATCH v2 5/9] mm/hugetlb: convert isolate_or_dissolve_huge_page
 to folios

On 6/12/23 10:41 AM, Matthew Wilcox wrote:
> On Tue, Nov 01, 2022 at 03:30:55PM -0700, Sidhartha Kumar wrote:
>> +++ b/mm/hugetlb.c
>> @@ -2815,7 +2815,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>   int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
>>   {
>>   	struct hstate *h;
>> -	struct page *head;
>> +	struct folio *folio = page_folio(page);
> 
> Is this safe?  I was reviewing a different patch today, and I spotted
> this.  With THP, we can relatively easily hit this case:
> 
> struct page points to a page with pfn 0x40305, in a folio of order 2.
> We call page_folio() on it and the resulting pointer is for the folio
> with pfn 0x40304.
> If we don't have our own refcount (or some other protection ...) against
> freeing, the folio can now be freed and reallocated.  Say it's now part
> of an order-3 folio.
> Our 'folio' pointer is now actually a pointer to a tail page, and we
> have various assertions that a folio pointer doesn't point to a tail
> page, so they trigger.
> 
> It seems to me that this ...
> 
>          /*
>           * The page might have been dissolved from under our feet, so make sure
>           * to carefully check the state under the lock.
>           * Return success when racing as if we dissolved the page ourselves.
>           */
>          spin_lock_irq(&hugetlb_lock);
>          if (folio_test_hugetlb(folio)) {
>                  h = folio_hstate(folio);
>          } else {
>                  spin_unlock_irq(&hugetlb_lock);
>                  return 0;
>          }
> 
> implies that we don't have our own reference on the folio, so we might
> find a situation where the folio pointer we have is no longer a folio
> pointer.
> 

If the folio became free and reallocated would this be considered a 
success? If the folio is no longer a hugetlb folio, 
isolate_or_dissolve_huge_page() returns as if it dissolved the page itself.

Later in the call stack, within alloc_and_dissolve_hugetlb_folio() there is

	if (!folio_test_hugetlb(old_folio)) {
		/*
		 * Freed from under us. Drop new_folio too.
		 */
		goto free_new;
	}

which would imply it is safe for the old_folio to have been dropped/freed.


> Maybe the page_folio() call should be moved inside the hugetlb_lock
> protection?  Is that enough?  I don't know enough about how hugetlb
> pages are split, freed & allocated to know what's going on.
> But then we _drop_ the lock, and keep referring to ...
> 
>> @@ -2841,10 +2840,10 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
>>   	if (hstate_is_gigantic(h))
>>   		return -ENOMEM;
>>   
>> -	if (page_count(head) && !isolate_hugetlb(head, list))
>> +	if (folio_ref_count(folio) && !isolate_hugetlb(&folio->page, list))
>>   		ret = 0;
>> -	else if (!page_count(head))
>> -		ret = alloc_and_dissolve_huge_page(h, head, list);
>> +	else if (!folio_ref_count(folio))
>> +		ret = alloc_and_dissolve_huge_page(h, &folio->page, list);
> 
> And I fall back to saying "I don't know enough to know if this is safe".