linux-kernel - Re: [PATCHv2] zsmalloc: allow only one active pool compaction context


Message-Id: <20230417174131.44de959204814209ef73e53e@linux-foundation.org>
Date:   Mon, 17 Apr 2023 17:41:31 -0700
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Sergey Senozhatsky <senozhatsky@...omium.org>
Cc:     Minchan Kim <minchan@...nel.org>,
        Yosry Ahmed <yosryahmed@...gle.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCHv2] zsmalloc: allow only one active pool compaction
 context

On Mon, 17 Apr 2023 22:54:20 +0900 Sergey Senozhatsky <senozhatsky@...omium.org> wrote:

> zsmalloc pool can be compacted concurrently by many contexts,
> e.g.
> 
>  cc1 handle_mm_fault()
>       do_anonymous_page()
>        __alloc_pages_slowpath()
>         try_to_free_pages()
>          do_try_to_free_pages()
>           lru_gen_shrink_node()
>            shrink_slab()
>             do_shrink_slab()
>              zs_shrinker_scan()
>               zs_compact()
> 
> This creates unnecessary contention as all those processes
> compete for access to the same classes. A single compaction
> process is enough. Moreover, contention that is created by
> multiple compaction processes impacts other zsmalloc functions,
> e.g. zs_malloc(), since zsmalloc uses "global" pool->lock to
> synchronize access to pool.
> 
> Introduce pool compaction mutex and permit only one compaction
> context at a time. This reduces overall pool->lock contention.

That isn't what the patch does!  It uses an atomic flag (atomic_xchg()
on pool->compaction_in_progress), not a mutex.  Perhaps an earlier
version used a mutex?

> /proc/lock_stat after make -j$((`nproc`+1)) linux kernel for
> &pool->lock#3:
> 
>                 Base           Patched
> ------------------------------------------
> con-bounces     2035730        1540066
> contentions     2343871        1774348
> waittime-min    0.10           0.10
> waittime-max    4004216.24     2745.22
> waittime-total  101334168.29   67865414.91
> waittime-avg    43.23          38.25
> acq-bounces     2895765        2186745
> acquisitions    6247686        5136943
> holdtime-min    0.07           0.07
> holdtime-max    2605507.97     482439.16
> holdtime-total  9998599.59     5107151.01
> holdtime-avg    1.60           0.99
> 
> Test run time:
> Base
> 2775.15user 1709.13system 2:13.82elapsed 3350%CPU
> 
> Patched
> 2608.25user 1439.03system 2:03.63elapsed 3273%CPU
> 
> ...
>
> @@ -2274,6 +2275,9 @@ unsigned long zs_compact(struct zs_pool *pool)
>  	struct size_class *class;
>  	unsigned long pages_freed = 0;
>  
> +	if (atomic_xchg(&pool->compaction_in_progress, 1))
> +		return 0;
> +

A code comment might be appropriate here.
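
To illustrate the sort of comment meant here, below is a minimal userspace
sketch of the same guard using C11 atomics instead of the kernel's atomic_t.
struct pool, do_compact() and pool_compact() are hypothetical stand-ins, not
zsmalloc code, and the release of the flag at the end is an assumption, since
that part of the patch is not quoted above.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct zs_pool; only the flag matters here. */
struct pool {
	atomic_int compaction_in_progress;
};

/* Hypothetical placeholder for the real per-class compaction work. */
static unsigned long do_compact(struct pool *pool)
{
	(void)pool;
	return 1;	/* pretend one page was freed */
}

unsigned long pool_compact(struct pool *pool)
{
	unsigned long pages_freed;

	/*
	 * Only one compaction context per pool at a time.
	 * atomic_exchange() returns the previous value: non-zero means
	 * another context is already compacting, so back off and report
	 * no pages freed rather than pile onto the pool lock.
	 */
	if (atomic_exchange(&pool->compaction_in_progress, 1))
		return 0;

	pages_freed = do_compact(pool);

	/* Assumed release step: let a later caller compact again. */
	atomic_store(&pool->compaction_in_progress, 0);
	return pages_freed;
}
```

In this sketch a caller that loses the race gets 0 back immediately and never
sleeps, which is the behaviour the atomic_xchg() test gives and a mutex
would not.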

Is the spin_is_contended() test in __zs_compact() still relevant?

And...  single-threading the operation seems a pretty sad way of
addressing a contention issue.  zs_compact() is fairly computationally
expensive - surely a large machine would like to be able to
concurrently run many instances of zs_compact()?