Message-ID: <4bad11b8-b257-44de-87a5-cf428eaa9a64@redhat.com>
Date: Sun, 30 Jun 2024 10:35:40 -0400
From: Waiman Long <longman@...hat.com>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: Tejun Heo <tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>,
Mike Snitzer <snitzer@...nel.org>, Laurence Oberman <loberman@...hat.com>,
Jonathan Brassow <jbrassow@...hat.com>, Ming Lei <minlei@...hat.com>,
Ondrej Kozina <okozina@...hat.com>, Milan Broz <gmazyland@...il.com>,
linux-kernel@...r.kernel.org, dm-devel@...ts.linux.dev
Subject: Re: dm-crypt performance regression due to workqueue changes
On 6/30/24 05:49, Mikulas Patocka wrote:
>
> On Sat, 29 Jun 2024, Waiman Long wrote:
>
>> On 6/29/24 14:15, Mikulas Patocka wrote:
>>> Hi
>>>
>>> I report that the patch 63c5484e74952f60f5810256bd69814d167b8d22
>>> ("workqueue: Add multiple affinity scopes and interface to select them")
>>> is causing massive dm-crypt slowdown in virtual machines.
>>>
>>> Steps to reproduce:
>>> * Install a system in a virtual machine with 16 virtual CPUs
>>> * Create a scratch file with "dd if=/dev/zero of=Scratch.img bs=1M
>>> count=2048 oflag=direct" - the file should be on a fast NVMe drive
>>> * Attach the scratch file to the virtual machine as /dev/vdb; cache mode
>>> should be 'none'
>>> * cryptsetup --force-password luksFormat /dev/vdb
>>> * cryptsetup luksOpen /dev/vdb cr
>>> * fio --direct=1 --bsrange=128k-128k --runtime=40 --numjobs=1
>>> --ioengine=libaio --iodepth=8 --group_reporting=1
>>> --filename=/dev/mapper/cr --name=job --rw=read
>>>
>>> With 6.5, we get 3600MiB/s; with 6.6 we get 1400MiB/s.
>>>
>>> The reason is that virt-manager by default sets up a topology where we
>>> have 16 sockets, 1 core per socket, 1 thread per core. And that workqueue
>>> patch avoids moving work items across sockets, so it processes all
>>> encryption work only on one virtual CPU.
>>>
>>> The performance degradation may be fixed with "echo 'system' >
>>> /sys/module/workqueue/parameters/default_affinity_scope" - but it is a
>>> regression anyway, as many users don't know about this option.
>>>
>>> How should we fix it? There are several options:
>>> 1. revert back to 'numa' affinity
>>> 2. revert to 'numa' affinity only if we are in a virtual machine
>>> 3. hack dm-crypt to set the 'numa' affinity for the affected workqueues
>>> 4. any other solution?
>> Another alternative is to go back to the old "numa" default if the kernel is
>> running under a hypervisor since the CPU configuration information is likely
>> to be incorrect anyway. The current default of "cache" will remain if not
>> under a hypervisor.
>>
>> Cheers,
>> Longman
> Yes. How could we detect that we run under a hypervisor portably? There's
> a flag X86_FEATURE_HYPERVISOR, but it's x86-only.
Right, that will be for x86 only. There is also a kernel boot command
line parameter "workqueue.default_affinity_scope=" that one can use to
set the default. It will be a bit easier to use than changing the sysfs
parameter at run time.
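
If we went the "numa under a hypervisor" route, a minimal sketch of the
idea (assuming the internal default-scope variable in kernel/workqueue.c
is wq_affn_dfl; the function name and init hook here are illustrative,
not a finished patch) could look like:

    /*
     * Sketch: fall back to the pre-6.6 "numa" affinity scope when
     * running under a hypervisor, where the reported cache topology
     * is likely meaningless.  X86_FEATURE_HYPERVISOR (checked via
     * boot_cpu_has(), <asm/cpufeature.h>) is x86-only, so other
     * architectures would keep the "cache" default.
     */
    static void __init wq_pick_default_affinity_scope(void)
    {
    #ifdef CONFIG_X86
            if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
                    wq_affn_dfl = WQ_AFFN_NUMA;
    #endif
    }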
Cheers,
Longman