linux-kernel - Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZhR4bnFLA08YgAgr@google.com>
Date: Mon, 8 Apr 2024 16:06:22 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Marcelo Tosatti <mtosatti@...hat.com>, Leonardo Bras <leobras@...hat.com>, 
	Paolo Bonzini <pbonzini@...hat.com>, Frederic Weisbecker <frederic@...nel.org>, 
	Neeraj Upadhyay <quic_neeraju@...cinc.com>, Joel Fernandes <joel@...lfernandes.org>, 
	Josh Triplett <josh@...htriplett.org>, Boqun Feng <boqun.feng@...il.com>, 
	Steven Rostedt <rostedt@...dmis.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, 
	Lai Jiangshan <jiangshanlai@...il.com>, Zqiang <qiang.zhang1211@...il.com>, kvm@...r.kernel.org, 
	linux-kernel@...r.kernel.org, rcu@...r.kernel.org
Subject: Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu

On Mon, Apr 08, 2024, Paul E. McKenney wrote:
> On Mon, Apr 08, 2024 at 02:56:29PM -0700, Sean Christopherson wrote:
> > > OK, then we can have difficulties with long-running interrupts hitting
> > > this range of code.  It is unfortunately not unheard-of for interrupts
> > > plus trailing softirqs to run for tens of seconds, even minutes.
> > 
> > Ah, and if that occurs, *and* KVM is slow to re-enter the guest, then there will
> > be a massive lag before the CPU gets back into a quiescent state.
> 
> Exactly!

..

> OK, then is it possible to get some other indication to the
> rcu_sched_clock_irq() function that it has interrupted a guest OS?

It's certainly possible, but I don't think we want to go down that road.

Any functionality built on that would be strictly limited to Intel CPUs, because
AFAIK, only Intel VMX has the mode where an IRQ can be handled without enabling
IRQs (which sounds stupid when I write it like that).

E.g. on AMD SVM, if an IRQ interrupts the guest, KVM literally handles it by
doing:

	local_irq_enable();
	++vcpu->stat.exits;
	local_irq_disable();

which means there's no way for KVM to guarantee that the IRQ that leads to
rcu_sched_clock_irq() is the _only_ IRQ that is taken (or that what RCU sees was
even the IRQ that interrupted the guest, though that probably doesn't matter much).

Orthogonal to RCU, I do think it makes sense to have KVM VMX handle IRQs in its
fastpath for VM-Exit, i.e. handle the IRQ VM-Exit and re-enter the guest without
ever enabling IRQs.  But that's purely a KVM optimization, e.g. to avoid useless
work when the host has already done what it needed to do.

But even then, to make it so RCU could safely skip invoke_rcu_core(), KVM would
need to _guarantee_ re-entry to the guest, and I don't think we want to do that.
E.g. if there is some work that needs to be done on the CPU, re-entering the guest
is a huge waste of cycles, as KVM would need to do some shenanigans to immediately
force a VM-Exit.  It'd also require a moderate amount of complexity that I wouldn't
want to maintain, particularly since it'd be Intel-only.

> Not an emergency, and maybe not even necessary, but it might well be
> one hole that would be good to stop up.
> 
> 							Thanx, Paul