Message-ID: <653a4881-d17b-6d8e-8066-300c0497a702@redhat.com>
Date: Mon, 22 May 2023 14:01:32 +0200
From: David Hildenbrand <david@...hat.com>
To: Huang Ying <ying.huang@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Hugh Dickins <hughd@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Matthew Wilcox <willy@...radead.org>,
Michal Hocko <mhocko@...e.com>,
Minchan Kim <minchan@...nel.org>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Yang Shi <shy828301@...il.com>, Yu Zhao <yuzhao@...gle.com>
Subject: Re: [PATCH -V2 2/5] swap, __read_swap_cache_async(): enlarge
get/put_swap_device protection range
On 22.05.23 09:09, Huang Ying wrote:
> This makes the function a little easier to understand because we
> no longer need to consider swapoff inside it. It also makes it
> possible to remove the get/put_swap_device() calls in some
> functions called by __read_swap_cache_async().
>
> Signed-off-by: "Huang, Ying" <ying.huang@...el.com>
> Cc: David Hildenbrand <david@...hat.com>
> Cc: Hugh Dickins <hughd@...gle.com>
> Cc: Johannes Weiner <hannes@...xchg.org>
> Cc: Matthew Wilcox <willy@...radead.org>
> Cc: Michal Hocko <mhocko@...e.com>
> Cc: Minchan Kim <minchan@...nel.org>
> Cc: Tim Chen <tim.c.chen@...ux.intel.com>
> Cc: Yang Shi <shy828301@...il.com>
> Cc: Yu Zhao <yuzhao@...gle.com>
> ---
> mm/swap_state.c | 33 ++++++++++++++++++++++-----------
> 1 file changed, 22 insertions(+), 11 deletions(-)
>
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index b76a65ac28b3..a1028fe7214e 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -417,9 +417,13 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> {
> struct swap_info_struct *si;
> struct folio *folio;
> + struct page *page;
> void *shadow = NULL;
>
> *new_page_allocated = false;
> + si = get_swap_device(entry);
> + if (!si)
> + return NULL;
>
> for (;;) {
> int err;
> @@ -428,14 +432,12 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> * called after swap_cache_get_folio() failed, re-calling
> * that would confuse statistics.
> */
> - si = get_swap_device(entry);
> - if (!si)
> - return NULL;
> folio = filemap_get_folio(swap_address_space(entry),
> swp_offset(entry));
> - put_swap_device(si);
> - if (!IS_ERR(folio))
> - return folio_file_page(folio, swp_offset(entry));
> + if (!IS_ERR(folio)) {
> + page = folio_file_page(folio, swp_offset(entry));
> + goto got_page;
> + }
>
> /*
> * Just skip read ahead for unused swap slot.
> @@ -445,8 +447,8 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> * as SWAP_HAS_CACHE. That's done in later part of code or
> * else swap_off will be aborted if we return NULL.
> */
> - if (!__swp_swapcount(entry) && swap_slot_cache_enabled)
> - return NULL;
> + if (!swap_swapcount(si, entry) && swap_slot_cache_enabled)
> + goto fail;
>
> /*
> * Get a new page to read into from swap. Allocate it now,
> @@ -455,7 +457,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> */
> folio = vma_alloc_folio(gfp_mask, 0, vma, addr, false);
> if (!folio)
> - return NULL;
> + goto fail;
>
> /*
> * Swap entry may have been freed since our caller observed it.
> @@ -466,7 +468,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>
> folio_put(folio);
> if (err != -EEXIST)
> - return NULL;
> + goto fail;
>
> /*
> * We might race against __delete_from_swap_cache(), and
> @@ -500,12 +502,17 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> /* Caller will initiate read into locked folio */
> folio_add_lru(folio);
> *new_page_allocated = true;
> - return &folio->page;
> + page = &folio->page;
> +got_page:
> + put_swap_device(si);
> + return page;
>
> fail_unlock:
> put_swap_folio(folio, entry);
> folio_unlock(folio);
> folio_put(folio);
> +fail:
Maybe "fail_put_swap" would be a better name for that label.
We now hold the swap device reference longer than we used to, preventing
swapoff over the whole operation. I guess there is no way we can deadlock
this way?
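
For context, the shape we end up with is the usual take-the-reference-once,
release-on-every-exit pattern, and since get_swap_device() is a tryget that
fails instead of sleeping (IIRC percpu_ref based), I'd expect holding it
longer to only delay swapoff, not deadlock. A quick standalone userspace
analogue of the pattern, just to illustrate (all names below are made up;
single-threaded for brevity, so this is a sketch, not the kernel code):

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct device {
	int refcount;		/* stand-in for the swap device's ref */
};

/* Non-blocking, like get_swap_device(): fails instead of sleeping. */
static bool device_tryget(struct device *dev)
{
	if (dev->refcount <= 0)
		return false;	/* device is going away ("swapoff") */
	dev->refcount++;
	return true;
}

static void device_put(struct device *dev)
{
	dev->refcount--;
}

/* Take the reference once up front; release it on every exit path. */
static void *read_async(struct device *dev, unsigned long offset)
{
	char *page;

	if (!device_tryget(dev))
		return NULL;

	page = malloc(4096);	/* may block; the device stays pinned */
	if (!page)
		goto fail;	/* cf. the fail/fail_put_swap label above */

	snprintf(page, 4096, "data at offset %lu", offset);
	device_put(dev);	/* success path, cf. got_page: */
	return page;

fail:
	device_put(dev);	/* failure path release */
	return NULL;
}

int main(void)
{
	struct device dev = { .refcount = 1 };
	char *page = read_async(&dev, 42);

	if (page) {
		puts(page);
		free(page);
	}
	return 0;
}

Purely illustrative, of course; the real synchronization is more subtle.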
In general, looks good to me.
--
Thanks,
David / dhildenb