linux-kernel - Re: [PATCH 2/8] mm/vmscan: Throttle reclaim and compaction when too may pages are isolated

Message-ID: <5e2c8c39-29d9-61be-049f-a408f62f5acf@suse.cz>
Date:   Thu, 14 Oct 2021 10:06:25 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Mel Gorman <mgorman@...hsingularity.net>,
        Linux-MM <linux-mm@...ck.org>
Cc:     NeilBrown <neilb@...e.de>, Theodore Ts'o <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        "Darrick J . Wong" <djwong@...nel.org>,
        Matthew Wilcox <willy@...radead.org>,
        Michal Hocko <mhocko@...e.com>,
        Dave Chinner <david@...morbit.com>,
        Rik van Riel <riel@...riel.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/8] mm/vmscan: Throttle reclaim and compaction when too
 may pages are isolated

On 10/8/21 15:53, Mel Gorman wrote:
> Page reclaim throttles on congestion if too many parallel reclaim instances
> have isolated too many pages. This makes no sense, excessive parallelisation
> has nothing to do with writeback or congestion.
> 
> This patch creates an additional workqueue to sleep on when too many
> pages are isolated. The throttled tasks are woken when the number
> of isolated pages is reduced or a timeout occurs. There may be
> some false positive wakeups for GFP_NOIO/GFP_NOFS callers but
> the tasks will throttle again if necessary.
> 
> [shy828301@...il.com: Wake up from compaction context]
> Signed-off-by: Mel Gorman <mgorman@...hsingularity.net>

...

> diff --git a/mm/internal.h b/mm/internal.h
> index 90764d646e02..06d0c376efcd 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -45,6 +45,15 @@ static inline void acct_reclaim_writeback(struct page *page)
>  		__acct_reclaim_writeback(pgdat, page, nr_throttled);
>  }
>  
> +static inline void wake_throttle_isolated(pg_data_t *pgdat)
> +{
> +	wait_queue_head_t *wqh;
> +
> +	wqh = &pgdat->reclaim_wait[VMSCAN_THROTTLE_ISOLATED];
> +	if (waitqueue_active(wqh))
> +		wake_up_all(wqh);

Again, would it be better to wake up just one task to prevent a possible
thundering herd? Can we assume that the woken task will eventually call
too_many_isolated() itself and wake up the next one? Although it seems strange
that too_many_isolated() is the place where we detect the situation for the
wakeup. I guess it's simpler than hooking into NR_ISOLATED decrementing.
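
To illustrate the wake-one idea, here is a rough userspace toy model, not
kernel code. Everything in it (struct throttle_model, the model_* helpers,
the fixed per-task batch size) is made up for illustration, and it assumes
each task that gets to proceed isolates a fixed batch of pages and rechecks
the limit before waking the next waiter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the isolated-pages throttle; not kernel code. */
struct throttle_model {
	unsigned int nr_isolated;	/* pages currently isolated */
	unsigned int limit;		/* threshold used by the model check */
	unsigned int nr_waiters;	/* tasks sleeping on the wait queue */
	unsigned int wakeups;		/* wakeups delivered so far */
	unsigned int resleeps;		/* woken tasks that had to sleep again */
};

/* Stand-in for too_many_isolated(): true while over the limit. */
static bool model_too_many_isolated(const struct throttle_model *m)
{
	return m->nr_isolated > m->limit;
}

/* wake_up_all() style: every waiter wakes, most may go back to sleep. */
static void model_wake_all(struct throttle_model *m, unsigned int batch)
{
	unsigned int woken = m->nr_waiters;

	assert(batch > 0);
	m->wakeups += woken;
	for (unsigned int i = 0; i < woken; i++) {
		if (model_too_many_isolated(m)) {
			m->resleeps++;		/* stays on the queue */
		} else {
			m->nr_waiters--;	/* proceeds with reclaim */
			m->nr_isolated += batch;
		}
	}
}

/*
 * Wake-one style: a waiter is woken only while the limit check passes,
 * and each task that proceeds is assumed to wake the next one.
 */
static void model_wake_one_chained(struct throttle_model *m, unsigned int batch)
{
	assert(batch > 0);
	while (m->nr_waiters > 0 && !model_too_many_isolated(m)) {
		m->wakeups++;
		m->nr_waiters--;
		m->nr_isolated += batch;
	}
}
```

In this model, with limit = 64, batch = 32 and ten waiters starting from zero
isolated pages, model_wake_all() delivers 10 wakeups and 7 of the woken tasks
go back to sleep, while model_wake_one_chained() delivers 3 wakeups and none
resleep. The real code can race between the wakeup and the recheck, which the
model ignores.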

> +}
> +
>  vm_fault_t do_swap_page(struct vm_fault *vmf);
>  
>  void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
...
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1006,11 +1006,10 @@ static void handle_write_error(struct address_space *mapping,
>  	unlock_page(page);
>  }
>  
> -static void
> -reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
> +void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
>  							long timeout)
>  {
> -	wait_queue_head_t *wqh = &pgdat->reclaim_wait;
> +	wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];

It seems weird that later in this function we increase nr_reclaim_throttled
without distinguishing the reason, so throttling for isolated pages
effectively triggers the NR_THROTTLED_WRITTEN counting in
acct_reclaim_writeback(), although it's not related at all. Maybe either have
separate nr_reclaim_throttled counters per vmscan_throttle_state (if a counter
for isolated is useful, I haven't seen the rest of the series yet), or count
only VMSCAN_THROTTLE_WRITEBACK tasks?
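
A rough userspace sketch of the per-reason variant, just to show what I mean.
This is not a proposed patch: struct pgdat_model, the model_* helpers,
NR_VMSCAN_THROTTLE and MODEL_SWAP_CLUSTER_MAX are names assumed for the
sketch, and C11 atomics stand in for the kernel's atomic_t. Only the
writeback counter feeds the wakeup threshold from __acct_reclaim_writeback().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Mirrors the kernel's SWAP_CLUSTER_MAX value of 32; name is model-only. */
#define MODEL_SWAP_CLUSTER_MAX 32UL

enum vmscan_throttle_state {
	VMSCAN_THROTTLE_WRITEBACK,
	VMSCAN_THROTTLE_ISOLATED,
	NR_VMSCAN_THROTTLE,	/* assumed array-size sentinel */
};

/* Hypothetical stand-in for the relevant part of pg_data_t. */
struct pgdat_model {
	atomic_int nr_reclaim_throttled[NR_VMSCAN_THROTTLE];
};

/* Called when a task starts throttling for the given reason. */
static void model_throttle_enter(struct pgdat_model *p,
				 enum vmscan_throttle_state reason)
{
	atomic_fetch_add(&p->nr_reclaim_throttled[reason], 1);
}

/* Called when a throttled task wakes up or times out. */
static void model_throttle_exit(struct pgdat_model *p,
				enum vmscan_throttle_state reason)
{
	atomic_fetch_sub(&p->nr_reclaim_throttled[reason], 1);
}

/*
 * Same comparison as the quoted hunk (nr_written > SWAP_CLUSTER_MAX *
 * nr_throttled), but nr_throttled counts writeback waiters only, so tasks
 * throttled on isolated pages no longer influence it.
 */
static bool model_should_wake_writeback(struct pgdat_model *p,
					unsigned long nr_written)
{
	int nr_throttled =
		atomic_load(&p->nr_reclaim_throttled[VMSCAN_THROTTLE_WRITEBACK]);

	if (nr_throttled <= 0)
		return false;	/* nobody is waiting on writeback */

	return nr_written > MODEL_SWAP_CLUSTER_MAX * (unsigned long)nr_throttled;
}
```

With this split, any number of VMSCAN_THROTTLE_ISOLATED waiters leaves the
writeback accounting untouched, and two writeback waiters need more than 64
written pages before the model reports a wakeup.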

>  	long ret;
>  	DEFINE_WAIT(wait);
>  
> @@ -1053,7 +1052,7 @@ void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page,
>  		READ_ONCE(pgdat->nr_reclaim_start);
>  
>  	if (nr_written > SWAP_CLUSTER_MAX * nr_throttled)
> -		wake_up_all(&pgdat->reclaim_wait);
> +		wake_up_all(&pgdat->reclaim_wait[VMSCAN_THROTTLE_WRITEBACK]);
>  }
>  
>  /* possible outcome of pageout() */