Message-ID: <007E2728-5BFC-40F9-8B8F-25504863D217@nvidia.com>
Date: Fri, 04 Jul 2025 21:55:31 -0400
From: Zi Yan <ziy@...dia.com>
To: Balbir Singh <balbirs@...dia.com>
Cc: linux-mm@...ck.org, akpm@...ux-foundation.org,
 linux-kernel@...r.kernel.org, Karol Herbst <kherbst@...hat.com>,
 Lyude Paul <lyude@...hat.com>, Danilo Krummrich <dakr@...nel.org>,
 David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
 Jérôme Glisse <jglisse@...hat.com>,
 Shuah Khan <shuah@...nel.org>, David Hildenbrand <david@...hat.com>,
 Barry Song <baohua@...nel.org>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
 Ryan Roberts <ryan.roberts@....com>, Matthew Wilcox <willy@...radead.org>,
 Peter Xu <peterx@...hat.com>, Kefeng Wang <wangkefeng.wang@...wei.com>,
 Jane Chu <jane.chu@...cle.com>, Alistair Popple <apopple@...dia.com>,
 Donet Tom <donettom@...ux.ibm.com>
Subject: Re: [v1 resend 08/12] mm/thp: add split during migration support

On 4 Jul 2025, at 20:58, Balbir Singh wrote:

> On 7/4/25 21:24, Zi Yan wrote:
>>
>> s/pages/folio
>>
>
> Thanks, will make the changes
>
>> Why name it isolated if the folio is unmapped? Isolated folios often mean
>> they are removed from LRU lists. isolated here causes confusion.
>>
>
> Ack, will change the name
>
>
>>>   *
>>>   * It calls __split_unmapped_folio() to perform uniform and non-uniform split.
>>>   * It is in charge of checking whether the split is supported or not and
>>> @@ -3800,7 +3799,7 @@ bool uniform_split_supported(struct folio *folio, unsigned int new_order,
>>>   */
>>>  static int __folio_split(struct folio *folio, unsigned int new_order,
>>>  		struct page *split_at, struct page *lock_at,
>>> -		struct list_head *list, bool uniform_split)
>>> +		struct list_head *list, bool uniform_split, bool isolated)
>>>  {
>>>  	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
>>>  	XA_STATE(xas, &folio->mapping->i_pages, folio->index);
>>> @@ -3846,14 +3845,16 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>>>  		 * is taken to serialise against parallel split or collapse
>>>  		 * operations.
>>>  		 */
>>> -		anon_vma = folio_get_anon_vma(folio);
>>> -		if (!anon_vma) {
>>> -			ret = -EBUSY;
>>> -			goto out;
>>> +		if (!isolated) {
>>> +			anon_vma = folio_get_anon_vma(folio);
>>> +			if (!anon_vma) {
>>> +				ret = -EBUSY;
>>> +				goto out;
>>> +			}
>>> +			anon_vma_lock_write(anon_vma);
>>>  		}
>>>  		end = -1;
>>>  		mapping = NULL;
>>> -		anon_vma_lock_write(anon_vma);
>>>  	} else {
>>>  		unsigned int min_order;
>>>  		gfp_t gfp;
>>> @@ -3920,7 +3921,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>>>  		goto out_unlock;
>>>  	}
>>>
>>> -	unmap_folio(folio);
>>> +	if (!isolated)
>>> +		unmap_folio(folio);
>>>
>>>  	/* block interrupt reentry in xa_lock and spinlock */
>>>  	local_irq_disable();
>>> @@ -3973,14 +3975,15 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>>>
>>>  		ret = __split_unmapped_folio(folio, new_order,
>>>  				split_at, lock_at, list, end, &xas, mapping,
>>> -				uniform_split);
>>> +				uniform_split, isolated);
>>>  	} else {
>>>  		spin_unlock(&ds_queue->split_queue_lock);
>>>  fail:
>>>  		if (mapping)
>>>  			xas_unlock(&xas);
>>>  		local_irq_enable();
>>> -		remap_page(folio, folio_nr_pages(folio), 0);
>>> +		if (!isolated)
>>> +			remap_page(folio, folio_nr_pages(folio), 0);
>>>  		ret = -EAGAIN;
>>>  	}
>>
>> These "isolated" special cases do not look good. I wonder if there
>> is a way of letting the split code handle device private folios more
>> gracefully. They also cause confusion: why do "isolated/unmapped" folios
>> not need unmap_page(), remap_page(), or unlock?
>>
>>
>
> There are two reasons for going down the current code path

After thinking more, I do not think adding isolated/unmapped is the right
approach, since an unmapped folio is a very generic concept. If you add it,
one can easily misuse the folio split code by first unmapping a folio
and then trying to split it with unmapped = true. I do not think that is
supported, and your patch does not prevent that from happening in the future.

You should teach the different parts of the folio split code path to handle
device private folios properly. Details are below.

>
> 1. if the isolated check is not present, folio_get_anon_vma will fail and cause
>    the split routine to return with -EBUSY

You could do something like the below instead:

if (!anon_vma && !folio_is_device_private(folio)) {
	ret = -EBUSY;
	goto out;
} else if (anon_vma) {
	anon_vma_lock_write(anon_vma);
}

That way, readers can see that a device private folio split needs special
handling.
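
To spell out the three cases of that check, here is a rough userspace
sketch. It is only an illustration: struct mock_folio, its fields, and
mock_split_prepare() are made up for this email and are not kernel types
or functions. The took_lock output matters because the unlock path would
have to skip the anon_vma unlock when no lock was taken.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio; not a kernel type. */
struct mock_folio {
	bool has_anon_vma;      /* models folio_get_anon_vma() != NULL */
	bool is_device_private; /* models folio_is_device_private() */
};

/*
 * Models the anon_vma step of the split path.
 * Returns 0 if the split may proceed, -EBUSY otherwise.
 * *took_lock reports whether anon_vma_lock_write() would have been
 * called, so the caller knows whether it must unlock later.
 */
int mock_split_prepare(const struct mock_folio *folio, bool *took_lock)
{
	*took_lock = false;

	/* No anon_vma is only an error for non device private folios. */
	if (!folio->has_anon_vma && !folio->is_device_private)
		return -EBUSY;

	if (folio->has_anon_vma)
		*took_lock = true; /* models anon_vma_lock_write(anon_vma) */

	return 0;
}
```

A regular anonymous folio without an anon_vma still fails with -EBUSY,
while a device private folio without one proceeds and takes no lock.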

BTW, why can a device private folio also be anonymous? Does it mean that
if a page cache folio is migrated to device private memory, the kernel
also sees it as both device private and file-backed?


> 2. Going through unmap_page(), remap_page() causes a full page table walk, which
>    the migrate_device API has already just done as a part of the migration. The
>    entries under consideration are already migration entries in this case.
>    This is wasteful and in some case unexpected.

unmap_folio() already adds TTU_SPLIT_HUGE_PMD to try to split the
PMD mapping, which is what you did in migrate_vma_split_pages(). You
probably can teach either try_to_migrate() or try_to_unmap() to just split
a device private PMD mapping. Or, if that is not preferred,
you can simply call split_huge_pmd_address() when unmap_folio()
sees a device private folio.

For remap_page(), you can simply return early for device private folios,
like it currently does for non-anonymous folios.


For lru_add_split_folio(), you can skip it if a device private
folio is seen.
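
The remap_page() and lru_add_split_folio() suggestions amount to two
early returns. A rough userspace sketch follows. It is only an
illustration: struct mock_folio, its fields, and the mock_* functions are
made up for this email and only model the shape of the checks, not the
real kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct folio; not a kernel type. */
struct mock_folio {
	bool is_anon;            /* models folio_test_anon() */
	bool is_device_private;  /* models folio_is_device_private() */
	bool on_lru;             /* set when the folio is added to the LRU */
	unsigned int rmap_walks; /* counts modeled page table walks */
};

/* Models remap_page(): return early when there is nothing to restore. */
void mock_remap_page(struct mock_folio *folio)
{
	/* Existing style of check, extended with the device private case. */
	if (!folio->is_anon || folio->is_device_private)
		return;

	folio->rmap_walks++; /* models the walk that removes migration entries */
}

/* Models lru_add_split_folio(): skip device private folios, as suggested. */
void mock_lru_add_split_folio(struct mock_folio *new_folio)
{
	if (new_folio->is_device_private)
		return;

	new_folio->on_lru = true;
}
```

With checks like these inside the helpers, the callers in the split path
would not need an extra bool parameter.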

Last, for the unlock part, why do you need to keep all after-split folios
locked? It should be possible to keep just the to-be-migrated folio
locked and unlock the rest for a later retry. But I could be missing
something, since I am not familiar with the device private migration code.
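
What I have in mind for the unlock part is roughly the following
userspace sketch. It is only an illustration: struct mock_folio and
mock_unlock_after_split() are made up for this email, and an array index
stands in for the to-be-migrated folio.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an after-split folio; not a kernel type. */
struct mock_folio {
	bool locked;
};

/*
 * Unlock every after-split folio except the one at index keep_locked,
 * which models the to-be-migrated folio.
 * Returns the number of folios that were unlocked.
 */
size_t mock_unlock_after_split(struct mock_folio *folios, size_t nr,
			       size_t keep_locked)
{
	size_t unlocked = 0;

	for (size_t i = 0; i < nr; i++) {
		if (i == keep_locked)
			continue; /* the caller still needs this one locked */
		if (folios[i].locked) {
			folios[i].locked = false;
			unlocked++;
		}
	}
	return unlocked;
}
```

The unlocked folios would then be picked up by a later retry, under the
assumption that the migration code can tolerate that.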

--
Best Regards,
Yan, Zi
