linux-kernel - Re: [PATCH v2] KVM: SVM: Set/clear SRSO's BP_SPEC

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aBtc7Y8odYFGGLrt@google.com>
Date: Wed, 7 May 2025 06:19:01 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Yosry Ahmed <yosry.ahmed@...ux.dev>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Michael Larabel <Michael@...haellarabel.com>, Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH v2] KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1
 VM count transitions

On Wed, May 07, 2025, Yosry Ahmed wrote:
> On Tue, May 06, 2025 at 08:57:36AM -0700, Sean Christopherson wrote:
> > Sleepable spinlocks aside, the lockdep_assert_irqs_enabled() in
> > smp_call_function_many_cond() already provides sufficient of coverage for that
> > case.  And if code is using some other form of IPI communication *and* taking raw
> > spinlocks, then I think it goes without saying that developers would need to be
> > very, very careful.
> 
> I think we are not talking about the same thing, or I am
> misunderstanding you. The lockdep_assert_irqs_enabled() assertion in
> smp_call_function_many_cond() does not protect against this case AFAICT.
> 
> Basically imagine that a new code path is added that does:
> 
> 	spin_lock_irq(&srso_lock);
> 	/* Some trivial logic, no IPI sending */
> 	spin_unlock_irq(&srso_lock);
> 
> I believe spin_lock_irq() will disable IRQs (at least on some setups)
> then spin on the lock.

Yes, because the most common use case for spin_lock_irq() is to prevent deadlock
due to the lock being taken in IRQ context.

> Now imagine svm_srso_vm_destroy() is already holding the lock and sends
> the IPI from CPU 1, while CPU 2 is executing the above code with IRQs
> already disabled and spinning on the lock.
> 
> This is the deadlock scenario I am talking about. The lockdep assertion
> in smp_call_function_many_cond() doesn't help because IRQs are enabled
> on CPU 1, the problem is that they are disabled on CPU 2.
> 
> Lockdep can detect this by keeping track of the fact that some code
> paths acquire the lock with IRQs off while some code paths acquire the
> lock and send IPIs, I think.

I understand the scenario, I just don't see any meaningful risk in this case,
which in turn means I don't see any reason to use an inferior lock type (for this
particular case) to protect the count.  

spin_lock_irq() isn't a tool that's used willy-nilly, and the usage of srso_lock
is extremely limited.  If we manage to merge code that does spin_lock_irq(&srso_lock),
you have my full permission to mock my ineptitude :-)