Message-ID: <7e680582-ac35-3d2d-8945-c26410ff4f9b@huaweicloud.com>
Date: Wed, 18 Jun 2025 16:26:42 +0800
From: Kemeng Shi <shikemeng@...weicloud.com>
To: Kairui Song <kasong@...cent.com>, linux-mm@...ck.org
Cc: Andrew Morton <akpm@...ux-foundation.org>, Hugh Dickins
<hughd@...gle.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
Matthew Wilcox <willy@...radead.org>, Chris Li <chrisl@...nel.org>,
Nhat Pham <nphamcs@...il.com>, Baoquan He <bhe@...hat.com>,
Barry Song <baohua@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/4] mm/shmem, swap: improve mthp swapin process

On 6/18/2025 2:35 AM, Kairui Song wrote:
> From: Kairui Song <kasong@...cent.com>
>
> Tidy up the mTHP swapin workflow. There should be no feature change; this
> consolidates the mTHP related checks into one place so they are all
> wrapped by CONFIG_TRANSPARENT_HUGEPAGE and will be trimmed off by the
> compiler if not needed.
>
> Signed-off-by: Kairui Song <kasong@...cent.com>
> ---
> mm/shmem.c | 175 ++++++++++++++++++++++++-----------------------------
> 1 file changed, 78 insertions(+), 97 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
...
Hello, here is another potential issue: shmem swapin can race with folio
split.
> alloced:
> + /*
> + * We need to split an existing large entry if swapin brought in a
> + * smaller folio due to various reasons.
> + *
> + * And worth noting there is a special case: if there is a smaller
> + * cached folio that covers @swap, but not @index (it only covers
> + * first few sub entries of the large entry, but @index points to
> + * later parts), the swap cache lookup will still see this folio,
> + * and we need to split the large entry here. Later checks will fail,
> + * as it can't satisfy the swap requirement, and we will retry
> + * the swapin from beginning.
> + */
> + swap_order = folio_order(folio);
> + if (order > swap_order) {
> + error = shmem_split_swap_entry(inode, index, swap, gfp);
> + if (error)
> + goto failed_nolock;
> + }
> +
> + index = round_down(index, 1 << swap_order);
> + swap.val = round_down(swap.val, 1 << swap_order);
> +
/* suppose the folio is split here by a racing task */
> /* We have to do this with folio locked to prevent races */
> folio_lock(folio);
> if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
> folio->swap.val != swap.val) {
> error = -EEXIST;
> - goto unlock;
> + goto failed_unlock;
> }
> if (!folio_test_uptodate(folio)) {
> error = -EIO;
> @@ -2407,8 +2386,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
> goto failed;
> }
>
> - error = shmem_add_to_page_cache(folio, mapping,
> - round_down(index, nr_pages),
> + error = shmem_add_to_page_cache(folio, mapping, index,
> swp_to_radix_entry(swap), gfp);
After such a split, the folio's actual order is less than swap_order, so
the swapped-in folio may not cover @index from the caller.
So we should move the index and swap.val calculation to after the folio is
locked.
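
Something like the following might work (just a rough sketch against the
quoted patch, untested; the split handling above the lock stays as in the
patch and the other error paths are elided):

	/* We have to do this with folio locked to prevent races */
	folio_lock(folio);
	/*
	 * Folio split requires the folio lock, so once we hold the lock
	 * the order is stable and the rounding below cannot go stale.
	 */
	swap_order = folio_order(folio);
	index = round_down(index, 1 << swap_order);
	swap.val = round_down(swap.val, 1 << swap_order);
	if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
	    folio->swap.val != swap.val) {
		error = -EEXIST;
		goto failed_unlock;
	}

This assumes swap cache folios stay naturally aligned in swap, so rounding
@swap.val by the locked folio's order still yields the folio's head entry
for the -EEXIST check.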