linux-kernel - Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f34ea327-bfb1-4014-8967-1429c2133e2d@paulmck-laptop>
Date: Mon, 8 Apr 2024 16:20:13 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Marcelo Tosatti <mtosatti@...hat.com>,
	Leonardo Bras <leobras@...hat.com>,
	Paolo Bonzini <pbonzini@...hat.com>,
	Frederic Weisbecker <frederic@...nel.org>,
	Neeraj Upadhyay <quic_neeraju@...cinc.com>,
	Joel Fernandes <joel@...lfernandes.org>,
	Josh Triplett <josh@...htriplett.org>,
	Boqun Feng <boqun.feng@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Zqiang <qiang.zhang1211@...il.com>, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org, rcu@...r.kernel.org
Subject: Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu

On Mon, Apr 08, 2024 at 04:06:22PM -0700, Sean Christopherson wrote:
> On Mon, Apr 08, 2024, Paul E. McKenney wrote:
> > On Mon, Apr 08, 2024 at 02:56:29PM -0700, Sean Christopherson wrote:
> > > > OK, then we can have difficulties with long-running interrupts hitting
> > > > this range of code.  It is unfortunately not unheard-of for interrupts
> > > > plus trailing softirqs to run for tens of seconds, even minutes.
> > > 
> > > Ah, and if that occurs, *and* KVM is slow to re-enter the guest, then there will
> > > be a massive lag before the CPU gets back into a quiescent state.
> > 
> > Exactly!
> 
> ...
> 
> > OK, then is it possible to get some other indication to the
> > rcu_sched_clock_irq() function that it has interrupted a guest OS?
> 
> It's certainly possible, but I don't think we want to go down that road.
> 
> Any functionality built on that would be strictly limited to Intel CPUs, because
> AFAIK, only Intel VMX has the mode where an IRQ can be handled without enabling
> IRQs (which sounds stupid when I write it like that).
> 
> E.g. on AMD SVM, if an IRQ interrupts the guest, KVM literally handles it by
> doing:
> 
> 	local_irq_enable();
> 	++vcpu->stat.exits;
> 	local_irq_disable();
> 
> which means there's no way for KVM to guarantee that the IRQ that leads to
> rcu_sched_clock_irq() is the _only_ IRQ that is taken (or that what RCU sees was
> even the IRQ that interrupted the guest, though that probably doesn't matter much).
> 
> Orthogonal to RCU, I do think it makes sense to have KVM VMX handle IRQs in its
> fastpath for VM-Exit, i.e. handle the IRQ VM-Exit and re-enter the guest without
> ever enabling IRQs.  But that's purely a KVM optimization, e.g. to avoid useless
> work when the host has already done what it needed to do.
> 
> But even then, to make it so RCU could safely skip invoke_rcu_core(), KVM would
> need to _guarantee_ re-entry to the guest, and I don't think we want to do that.
> E.g. if there is some work that needs to be done on the CPU, re-entering the guest
> is a huge waste of cycles, as KVM would need to do some shenanigans to immediately
> force a VM-Exit.  It'd also require a moderate amount of complexity that I wouldn't
> want to maintain, particularly since it'd be Intel-only.

Thank you for the analysis!

It sounds like the current state, imperfect though it might be, is the
best of the known possible worlds at the moment.

But should anyone come up with something better, please do not keep it
a secret!

							Thanx, Paul

> > Not an emergency, and maybe not even necessary, but it might well be
> > one hole that would be good to stop up.
> > 
> > 							Thanx, Paul