Message-ID: <ZoKuLzyhE8N4RaW4@redhat.com>
Date: Mon, 1 Jul 2024 14:25:03 +0100
From: Daniel P. Berrangé <berrange@...hat.com>
To: Michal Prívozník <mprivozn@...hat.com>
Cc: Mikulas Patocka <mpatocka@...hat.com>, Tejun Heo <tj@...nel.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
Waiman Long <longman@...hat.com>, Mike Snitzer <snitzer@...nel.org>,
Laurence Oberman <loberman@...hat.com>,
Jonathan Brassow <jbrassow@...hat.com>,
Ming Lei <minlei@...hat.com>, Ondrej Kozina <okozina@...hat.com>,
Milan Broz <gmazyland@...il.com>, linux-kernel@...r.kernel.org,
dm-devel@...ts.linux.dev, users@...ts.libvirt.org
Subject: Re: dm-crypt performance regression due to workqueue changes
On Mon, Jul 01, 2024 at 02:48:07PM +0200, Michal Prívozník wrote:
> On 6/30/24 20:49, Mikulas Patocka wrote:
> >
> >
> > On Sun, 30 Jun 2024, Tejun Heo wrote:
> >
> >> Hello,
> >>
> >> On Sat, Jun 29, 2024 at 08:15:56PM +0200, Mikulas Patocka wrote:
> >>
> >>> With 6.5, we get 3600MiB/s; with 6.6 we get 1400MiB/s.
> >>>
> >>> The reason is that virt-manager by default sets up a topology where we
> >>> have 16 sockets, 1 core per socket, 1 thread per core. And that workqueue
> >>> patch avoids moving work items across sockets, so it processes all
> >>> encryption work only on one virtual CPU.
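
As a quick illustration, that topology is visible from inside such a
guest with lscpu (hypothetical output matching the 16-socket default
described above):

  $ lscpu | grep -E 'Socket|Core|Thread'
  Thread(s) per core:    1
  Core(s) per socket:    1
  Socket(s):             16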
> >>>
> >>> The performance degradation may be fixed with
> >>> "echo 'system' > /sys/module/workqueue/parameters/default_affinity_scope"
> >>> - but it is a regression anyway, as many users don't know about
> >>> this option.
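
For completeness, a sketch of that workaround and a boot-time
equivalent (the sysfs path is quoted from the message above; the
workqueue.default_affinity_scope boot parameter is documented in the
kernel's workqueue documentation):

  # show the current scope ('cache' is the default as of 6.6)
  cat /sys/module/workqueue/parameters/default_affinity_scope
  # runtime workaround: let unbound work roam the whole system
  echo system > /sys/module/workqueue/parameters/default_affinity_scope
  # persistent alternative via the kernel command line:
  #   workqueue.default_affinity_scope=system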
> >>>
> >>> How should we fix it? There are several options:
> >>> 1. revert back to 'numa' affinity
> >>> 2. revert to 'numa' affinity only if we are in a virtual machine
> >>> 3. hack dm-crypt to set the 'numa' affinity for the affected workqueues
> >>> 4. any other solution?
> >>
> >> Do you happen to know why libvirt is doing that? There are many other
> >> implications to configuring the system that way and I don't think we want to
> >> design kernel behaviors to suit topology information fed to VMs which can be
> >> arbitrary.
>
> Firstly, libvirt's not doing anything. It very specifically avoids
> making policy decisions. If something configures vCPUs so that they
> are in separate sockets, then we should look at that something.
> Alternatively, if the "default" configuration does not work well for
> your workflow, document the recommended configuration.
Actually, in this particular case it is, strictly speaking, libvirt.
If the guest XML config does not mention any <topology> info, then
libvirt explicitly tells QEMU to set sockets=N,cores=1,threads=1.
That matches QEMU's own historical built-in default topology.
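
For illustration, with an 8-vCPU guest that default corresponds to a
QEMU command line along these lines (a sketch; the remaining options
are omitted):

  qemu-system-x86_64 -smp 8,sockets=8,cores=1,threads=1 ...
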
None the less, my advice for mgmt applications using libvirt would
likely be to explicitly request sockets=1,cores=N,threads=1. This
is because it gives slightly better compatibility with unpleasant
software that applies licensing / subscription rules that penalize
use of many sockets, while being happy with any number of cores.
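
In guest XML terms that request would look something like this
(illustrative values for an 8-vCPU guest):

  <vcpu>8</vcpu>
  <cpu>
    <topology sockets='1' cores='8' threads='1'/>
  </cpu>
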
Either way though, the topology is a lie when the guest CPUs
are not pinned to host CPUs, so making performance decisions based
on this is unlikely to yield the desired results. Historically the
cores vs sockets distinction hasn't seemed to make much difference
to guest OS performance, as the OSes haven't made significant
decisions on this axis. Exposing threads != 1, though, has always
been a big no unless strictly pinning guest CPUs 1:1 to host CPUs,
as that has had notable impacts on scheduling decisions.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|