Message-ID: <alpine.LRH.2.21.2302021052580.21238@file01.intranet.prod.int.rdu2.redhat.com>
Date: Thu, 2 Feb 2023 11:04:24 -0500 (EST)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Nathan Huckleberry <nhuck@...gle.com>
cc: Mike Snitzer <snitzer@...nel.org>, linux-kernel@...r.kernel.org,
Eric Biggers <ebiggers@...nel.org>, dm-devel@...hat.com,
Sami Tolvanen <samitolvanen@...gle.com>,
Alasdair Kergon <agk@...hat.com>
Subject: Re: [dm-devel] [PATCH] dm-verity: Remove WQ_UNBOUND.
On Wed, 1 Feb 2023, Nathan Huckleberry wrote:
> Setting WQ_UNBOUND increases scheduler latency on ARM64. This is likely
> due to the asymmetric architecture of ARM64 processors.
>
> I've been unable to reproduce the results that claim WQ_UNBOUND gives a
> performance boost on x86-64.
>
> This flag is causing performance issues for multiple subsystems within
> Android. Notably, the same slowdown exists for decompression with
> EROFS.
>
> | open-prebuilt-camera  | WQ_UNBOUND | ~WQ_UNBOUND   |
> |-----------------------|------------|---------------|
> | verity wait time (us) | 11746      | 119 (-98%)    |
> | erofs wait time (us)  | 357805     | 174205 (-51%) |
>
> | sha256 ramdisk random read | WQ_UNBOUND   | ~WQ_UNBOUND |
> |----------------------------|--------------|-------------|
> | arm64 (accelerated)        | bw=42.4MiB/s | bw=212MiB/s |
> | arm64 (generic)            | bw=16.5MiB/s | bw=48MiB/s  |
> | x86_64 (generic)           | bw=233MiB/s  | bw=230MiB/s |
>
> Cc: Sami Tolvanen <samitolvanen@...gle.com>
> Cc: Eric Biggers <ebiggers@...nel.org>
> Signed-off-by: Nathan Huckleberry <nhuck@...gle.com>
> ---
> drivers/md/dm-verity-target.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
> index ccf5b852fbf7..020fd2341025 100644
> --- a/drivers/md/dm-verity-target.c
> +++ b/drivers/md/dm-verity-target.c
> @@ -1399,8 +1399,8 @@ static int verity_ctr(struct dm_target *ti, unsigned argc, char **argv)
>  		goto bad;
>  	}
>  
> -	/* WQ_UNBOUND greatly improves performance when running on ramdisk */
> -	wq_flags = WQ_MEM_RECLAIM | WQ_UNBOUND;
> +	wq_flags = WQ_MEM_RECLAIM;
> +
>  	/*
>  	 * Using WQ_HIGHPRI improves throughput and completion latency by
>  	 * reducing wait times when reading from a dm-verity device.
Hi
If you remove WQ_UNBOUND, you should also change the last argument of
alloc_workqueue from num_online_cpus() to either 0 or 1. Try both 0 and 1
and tell us which performs better.
Mikulas
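
For reference, a rough sketch of what the suggested follow-up change might look like. The alloc_workqueue() call is not part of the quoted hunk, so the workqueue name "kverityd" and the v->verify_wq field below are assumptions about the surrounding code. This is an illustration, not a tested patch.

```diff
-	v->verify_wq = alloc_workqueue("kverityd", wq_flags, num_online_cpus());
+	/*
+	 * Assumed call site; it does not appear in the quoted hunk.
+	 * Without WQ_UNBOUND the workqueue is per-CPU, and max_active
+	 * limits in-flight work items per CPU: 0 selects the workqueue
+	 * default, 1 allows one work item at a time on each CPU.
+	 */
+	v->verify_wq = alloc_workqueue("kverityd", wq_flags, 0);
```

To compare the two values suggested above, the same line can be rebuilt with 1 in place of 0 and the benchmarks from the patch description rerun.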