Message-ID: <ZoKqYg7TKiozapmW@redhat.com>
Date: Mon, 1 Jul 2024 14:08:50 +0100
From: Daniel P. Berrangé <berrange@...hat.com>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: Tejun Heo <tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>,
Waiman Long <longman@...hat.com>, Mike Snitzer <snitzer@...nel.org>,
Laurence Oberman <loberman@...hat.com>,
Jonathan Brassow <jbrassow@...hat.com>,
Ming Lei <minlei@...hat.com>, Ondrej Kozina <okozina@...hat.com>,
Milan Broz <gmazyland@...il.com>, linux-kernel@...r.kernel.org,
dm-devel@...ts.linux.dev, users@...ts.libvirt.org
Subject: Re: dm-crypt performance regression due to workqueue changes
On Sun, Jun 30, 2024 at 08:49:48PM +0200, Mikulas Patocka wrote:
>
>
> On Sun, 30 Jun 2024, Tejun Heo wrote:
>
> > Hello,
> >
> > On Sat, Jun 29, 2024 at 08:15:56PM +0200, Mikulas Patocka wrote:
> >
> > > With 6.5, we get 3600MiB/s; with 6.6 we get 1400MiB/s.
> > >
> > > The reason is that virt-manager by default sets up a topology where we
> > > have 16 sockets, 1 core per socket, 1 thread per core. And that workqueue
> > > patch avoids moving work items across sockets, so it processes all
> > > encryption work only on one virtual CPU.
>
> > > The performance degradation may be fixed with
> > > "echo 'system' >/sys/module/workqueue/parameters/default_affinity_scope"
> > > - but it is a regression anyway, as many users don't know about this
> > > option.
> > >
> > > How should we fix it? There are several options:
> > > 1. revert back to 'numa' affinity
> > > 2. revert to 'numa' affinity only if we are in a virtual machine
> > > 3. hack dm-crypt to set the 'numa' affinity for the affected workqueues
> > > 4. any other solution?
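
FWIW, if option 3 were pursued, I'd imagine it looking roughly like the
untested sketch below. Caveats: crypt_set_numa_scope() is a hypothetical
helper name, alloc_workqueue_attrs()/apply_workqueue_attrs() are kernel
internal and not currently exported to modules (so a real patch would
need to export them, or add a flag/parameter to alloc_workqueue()), and
applying fresh attrs also resets the workqueue's nice value and cpumask
to defaults:

  #include <linux/workqueue.h>

  /*
   * Untested sketch: force the pre-6.6 'numa' affinity scope on one
   * of dm-crypt's unbound workqueues, instead of relying on the
   * system-wide default_affinity_scope module parameter.
   */
  static int crypt_set_numa_scope(struct workqueue_struct *wq)
  {
          struct workqueue_attrs *attrs;
          int ret;

          attrs = alloc_workqueue_attrs();
          if (!attrs)
                  return -ENOMEM;

          /* spread work items across each NUMA node, as 6.5 did */
          attrs->affn_scope = WQ_AFFN_NUMA;

          /* only meaningful for WQ_UNBOUND workqueues */
          ret = apply_workqueue_attrs(wq, attrs);

          free_workqueue_attrs(attrs);
          return ret;
  }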
> >
> > Do you happen to know why libvirt is doing that? There are many other
> > implications to configuring the system that way and I don't think we want to
> > design kernel behaviors to suit topology information fed to VMs which can be
> > arbitrary.
> >
> > Thanks.
>
> I don't know why. I added users@...ts.libvirt.org to the CC.
>
> How should libvirt properly advertise "we have 16 threads that are
> dynamically scheduled by the host kernel, so the latencies between them
> are changing and unpredictable"?
NB, libvirt is just the control plane; the actual virtual hardware exposed
to the guest is implemented across QEMU and the KVM kernel module. Guest
CPU topology and/or NUMA cost information is the responsibility of QEMU.
When QEMU's virtual CPUs are floating freely across host CPUs there's
no perfect answer. The host admin needs to make a tradeoff in their
configuration:
They can optimize for density by allowing guest CPUs to float freely
and allowing CPU overcommit against host CPUs, in which case the guest
CPU topology is essentially a lie.
Or they can optimize for predictable performance by strictly pinning
guest CPUs 1:1 to host CPUs, minimizing CPU overcommit, and making the
guest CPU topology match the host CPU topology 1:1.
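
As an illustration of the latter, a 4-vCPU guest pinned 1:1 might carry
something like this in its libvirt domain XML (the host cpuset values
and the topology here are arbitrary examples; they should reflect the
real host CPUs the guest is pinned to):

  <domain type='kvm'>
    ...
    <vcpu placement='static'>4</vcpu>
    <cputune>
      <vcpupin vcpu='0' cpuset='8'/>
      <vcpupin vcpu='1' cpuset='9'/>
      <vcpupin vcpu='2' cpuset='10'/>
      <vcpupin vcpu='3' cpuset='11'/>
    </cputune>
    <cpu mode='host-passthrough'>
      <!-- describe the topology the pinned host CPUs actually have -->
      <topology sockets='1' cores='4' threads='1'/>
    </cpu>
    ...
  </domain>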
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|