Date: Sun, 30 Jun 2024 11:49:45 +0200 (CEST)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Waiman Long <longman@...hat.com>
cc: Tejun Heo <tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>, 
    Mike Snitzer <snitzer@...nel.org>, Laurence Oberman <loberman@...hat.com>, 
    Jonathan Brassow <jbrassow@...hat.com>, Ming Lei <minlei@...hat.com>, 
    Ondrej Kozina <okozina@...hat.com>, Milan Broz <gmazyland@...il.com>, 
    linux-kernel@...r.kernel.org, dm-devel@...ts.linux.dev
Subject: Re: dm-crypt performance regression due to workqueue changes



On Sat, 29 Jun 2024, Waiman Long wrote:

> On 6/29/24 14:15, Mikulas Patocka wrote:
> > Hi
> >
> > I report that the patch 63c5484e74952f60f5810256bd69814d167b8d22
> > ("workqueue: Add multiple affinity scopes and interface to select them")
> > is causing massive dm-crypt slowdown in virtual machines.
> >
> > Steps to reproduce:
> > * Install a system in a virtual machine with 16 virtual CPUs
> > * Create a scratch file with "dd if=/dev/zero of=Scratch.img bs=1M
> >    count=2048 oflag=direct" - the file should be on a fast NVMe drive
> > * Attach the scratch file to the virtual machine as /dev/vdb; cache mode
> >    should be 'none'
> > * cryptsetup --force-password luksFormat /dev/vdb
> > * cryptsetup luksOpen /dev/vdb cr
> > * fio --direct=1 --bsrange=128k-128k --runtime=40 --numjobs=1
> >    --ioengine=libaio --iodepth=8 --group_reporting=1
> >    --filename=/dev/mapper/cr --name=job --rw=read
> >
> > With 6.5, we get 3600MiB/s; with 6.6 we get 1400MiB/s.
> >
> > The reason is that virt-manager by default sets up a topology where we
> > have 16 sockets, 1 core per socket, 1 thread per core. And that workqueue
> > patch avoids moving work items across sockets, so it processes all
> > encryption work only on one virtual CPU.
> >
> > The performance degradation may be fixed with
> >   echo 'system' > /sys/module/workqueue/parameters/default_affinity_scope
> > but it is a regression anyway, as many users don't know about this option.
> >
> > How should we fix it? There are several options:
> > 1. revert back to 'numa' affinity
> > 2. revert to 'numa' affinity only if we are in a virtual machine
> > 3. hack dm-crypt to set the 'numa' affinity for the affected workqueues
> > 4. any other solution?
> 
> Another alternative is to go back to the old "numa" default if the kernel is
> running under a hypervisor since the cpu configuration information is likely
> to be incorrect anyway. The current default of "cache" will remain if not
> under a hypervisor.
> 
> Cheers,
> Longman

Yes. How could we portably detect that we are running under a hypervisor?
There's a flag X86_FEATURE_HYPERVISOR, but it's x86-only.
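
For illustration, here is a minimal userspace sketch of what that x86-only
check amounts to. On x86 the kernel reports the bit as the word "hypervisor"
in the "flags" line of /proc/cpuinfo; other architectures have no such word,
so the check simply reports nothing there, which is the portability problem.
The helper names has_hypervisor_flag and current_affinity_scope are made up
for this example, and a hypervisor may be configured to hide the bit, so
treat the result as a hint only.

```shell
#!/bin/sh
# Sketch only: userspace view of the x86-only hypervisor flag, plus the
# workqueue default affinity scope discussed in this thread.
# Helper names are hypothetical; they are not part of any existing tool.

# has_hypervisor_flag [FILE]
# Succeeds if a "flags" line in FILE (default: /proc/cpuinfo) contains the
# word "hypervisor". Fails on non-x86, where that word is not reported.
has_hypervisor_flag() {
    grep -E '^flags[[:space:]]*:' "${1:-/proc/cpuinfo}" 2>/dev/null |
        grep -qw hypervisor
}

# current_affinity_scope [FILE]
# Prints the current default affinity scope (for example "cache" or
# "system"). Fails if the parameter file does not exist (kernels before 6.6).
current_affinity_scope() {
    cat "${1:-/sys/module/workqueue/parameters/default_affinity_scope}" 2>/dev/null
}

if has_hypervisor_flag; then
    echo "hypervisor flag: set"
else
    echo "hypervisor flag: not set (or not an x86 system)"
fi
echo "default_affinity_scope: $(current_affinity_scope || echo unknown)"
```

Running it inside the guest from the reproduction steps shows whether the
flag is visible and which scope is in effect before and after the "echo
'system'" workaround.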

Mikulas
