linux-kernel - Re: [PATCH 2/2] KVM: Protect vCPU's "last run PID" with rwlock, not RCU

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Zrq_hd5CkmmOl6jp@google.com>
Date: Mon, 12 Aug 2024 19:05:57 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Oliver Upton <oliver.upton@...ux.dev>
Cc: Marc Zyngier <maz@...nel.org>, Paolo Bonzini <pbonzini@...hat.com>, 
	linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev, 
	kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Steve Rutherford <srutherford@...gle.com>
Subject: Re: [PATCH 2/2] KVM: Protect vCPU's "last run PID" with rwlock, not RCU

On Fri, Aug 09, 2024, Oliver Upton wrote:
> On Tue, Aug 06, 2024 at 04:59:03PM -0700, Sean Christopherson wrote:
> > > Can you nest this lock inside of the vcpu->mutex acquisition in
> > > kvm_vm_ioctl_create_vcpu() so lockdep gets the picture?
> > 
> > I don't think that's necessary.  Commit 42a90008f890 ("KVM: Ensure lockdep knows
> > about kvm->lock vs. vcpu->mutex ordering rule") added the lock+unlock in
> > kvm_vm_ioctl_create_vcpu() purely because actually taking vcpu->mutex inside
> > kvm->lock is rare, i.e. lockdep would be unable to detect issues except for very
> > specific VM types hitting very specific flows.
> 
> I don't think the perceived rarity matters at all w/ this.

Rarity definitely matters.  If KVM was splattered with code that takes vcpu->mutex
inside kvm->lock, then the mess that led to above commit likely would never had
happened.

> Beyond the lockdep benefits, it is a self-documenting way to describe lock
> ordering.

Lock acquisition alone won't suffice, many of the more unique locks in KVM need
comments/documentation, e.g. to explain additional rules, assumptions that make
things work, etc.  We could obviously add comments for everything, but I don't
see how that's clearly better than actual documentation.  E.g. pid_lock is taken
for read across vCPUs.  Acquiring vcpu->pid_lock inside vcpu->mutex doesn't
capture that at all.

It's also simply not realistic to enumerate every possible combination.  Many of
the combinations will likely never happen in practice, especially for spinlocks
since their usage is quite targeted.  Trying to document the "preferred" ordering
between the various spinlocks would be an exercise in futility as so many would
be 100% arbitrary due to lack of a use case.

KVM's mutexes are more interesting because they tend to be coarser, and thus are
more prone to nesting, so maybe we could have lockdep-enforced documentation for
those?  But if we want to do that, I think we should have a dedicated helper (and
possible arch hooks), not an ad hoc pile of locks in vCPU creation.

And we should have that discussion over here[*], because I was planning on posting
a patch to revert the lockdep-only lock "documentation".

[*] https://lore.kernel.org/all/ZrFYsSPaDWUHOl0N@google.com

> Dunno about you, but I haven't kept up with locking.rst at all :)

Heh, x86 has done a decent job of documenting its lock usage.  I would much rather
add an entry in locking.rst for this new lock than add a lock+unlock in vCPU
creation.  Especially since the usage is rather uncommon (guaranteed single writer,
readers are best-effort and cross-vCPU).

> Having said that, an inversion would still be *very* obvious, as it
> would be trying to grab a mutex while holding a spinlock...
> 
> -- 
> Thanks,
> Oliver