lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CABgObfa6DGM4X5HKb6FHoeiitTp4ddVaUGLRqK+Z=a-JLc9Bzw@mail.gmail.com>
Date: Tue, 11 Feb 2025 11:40:29 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Marc Zyngier <maz@...nel.org>
Cc: Maxim Levitsky <mlevitsk@...hat.com>, kvm@...r.kernel.org, 
	Jing Zhang <jingzhangos@...gle.com>, Oliver Upton <oliver.upton@...ux.dev>, 
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org, 
	Randy Dunlap <rdunlap@...radead.org>, Suzuki K Poulose <suzuki.poulose@....com>, 
	Palmer Dabbelt <palmer@...belt.com>, Zenghui Yu <yuzenghui@...wei.com>, kvm-riscv@...ts.infradead.org, 
	Ingo Molnar <mingo@...hat.com>, linux-riscv@...ts.infradead.org, 
	Joey Gouly <joey.gouly@....com>, Paul Walmsley <paul.walmsley@...ive.com>, 
	Thomas Gleixner <tglx@...utronix.de>, Bjorn Helgaas <bhelgaas@...gle.com>, 
	Albert Ou <aou@...s.berkeley.edu>, kvmarm@...ts.linux.dev, 
	Alexander Potapenko <glider@...gle.com>, x86@...nel.org, Sean Christopherson <seanjc@...gle.com>, 
	Anup Patel <anup@...infault.org>, Kunkun Jiang <jiangkunkun@...wei.com>, 
	Atish Patra <atishp@...shpatra.org>, Catalin Marinas <catalin.marinas@....com>, 
	Will Deacon <will@...nel.org>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH 2/3] KVM: arm64: switch to using kvm_lock/unlock_all_vcpus

On Tue, Feb 11, 2025 at 10:25 AM Marc Zyngier <maz@...nel.org> wrote:
> > No functional change intended.
>
> Actually plenty of it.

Yes, definitely. That's not "no functional change intended", it's "no
breakage of sane userspace intended" which is pretty much the
opposite.

> Yet the new helper returns -EINTR, which you blindly forward to userspace.

It won't quite reach userspace, since the task won't survive
mutex_lock_killable(). With your current code the kill will arrive
just after the ioctl returns to userspace, so there's no change in
behavior there. The -EBUSY change is the one that matters.

> At the end of the day, the x86 locking serves completely different
> purposes. It wants to gracefully wait for vcpus to exit and is happy
> to replay things, because migration (which is what x86 seems to be
> using this for) is a stupidly long process.

No, it's not using it for the whole length of migration. It serves the
same exact purpose as ARM: do some stuff on something that spans a
whole struct kvm. The only difference is the behavior on contended
mutex, where x86 simply says don't do it.

> Our locking is designed to
> either succeed or fail quickly, because some of the lock paths are on
> the critical path for VM startup and configuration.

The only long-running vCPU ioctl is KVM_RUN and you don't have that
during VM startup and configuration. The interesting case is
snapshotting, i.e. reading attributes.

Failing quickly when reading attributes makes sense, and I'd be
rightly wary of changing it. I know that QEMU is doing them only when
it knows CPUs are stopped, but it does seem to affect crosvm and
Firecracker more, for snapshotting more than startup/configuration.
The GIC snapshotting code in crosvm returns an anyhow::Result and ends
up logging an error with println!, Firecracker is similar as it also
propagates the error. So neither crosvm nor Firecracker have
particularly sophisticated error handling but they do seem to want to
fail their RPCs quickly.

Failing quickly when creating the device or setting attributes makes
less sense (Firecracker for one does an unwrap() there). But anyhow
the change would have to be a separate patch, certainly not one
applied through the generic KVM tree; and if you decide that you're
stuck with it, that would be more than understandable.

> So for this series to be acceptable, you'd have to provide the same
> semantics. It is probably doable with a bit of macro magic, at the
> expense of readability.

Or it can be just an extra bool argument, plus wrappers.

For RISC-V it's only used at device creation time, and the ioctl fails
if the vCPU ran at least once. Removing the "try" in trylock is
totally feasible there, however it must be documented in the commit
message.

Thanks,

Paolo

> What I would also like to see is for this primitive to be usable with
> scoped_cond_guard(), which would make the code much more readable.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ