Message-ID: <ZpGL1rEHNild9CG5@LeoBras>
Date: Fri, 12 Jul 2024 17:02:30 -0300
From: Leonardo Bras <leobras@...hat.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: Leonardo Bras <leobras@...hat.com>,
"Paul E. McKenney" <paulmck@...nel.org>,
Leonardo Bras <leobras.c@...il.com>,
Sean Christopherson <seanjc@...gle.com>,
Frederic Weisbecker <frederic@...nel.org>,
Marcelo Tosatti <mtosatti@...hat.com>,
linux-kernel@...r.kernel.org,
kvm@...r.kernel.org
Subject: Re: [RFC PATCH 1/1] kvm: Note an RCU quiescent state on guest exit
On Fri, Jul 12, 2024 at 05:57:10PM +0200, Paolo Bonzini wrote:
> On 7/11/24 01:18, Leonardo Bras wrote:
> > What are your thoughts on above results?
> > Anything you would suggest changing?
>
Hello Paolo, thanks for the feedback!
> Can you run the test with a conditional on "!tick_nohz_full_cpu(vcpu->cpu)"?
>
> If your hunch is correct that nohz-full CPUs already avoid invoke_rcu_core()
> you might get the best of both worlds.
>
> tick_nohz_full_cpu() is very fast when there is no nohz-full CPU, because
> then it shortcuts on context_tracking_enabled() (which is just a static
> key).
But that would mean not noting an RCU quiescent state in guest_exit of
nohz_full cpus, right?
The original issue we were dealing with was invoke_rcu_core() running on
nohz_full cpus and messing up the latency of RT workloads inside the VM.
While most invoke_rcu_core() calls get ignored by the nohz_full rule,
there are some scenarios in which the vcpu thread may take more than 1s
between a guest_entry and the next one (VM busy), and the calls which did
not get ignored have caused latency peaks in our tests.
The main idea of this patch is to note RCU quiescent states on guest_exit
at nohz_full cpus (and use rcu.patience) to avoid running invoke_rcu_core()
between a guest_exit and the next guest_entry if it takes less than
rcu.patience milliseconds between exit and entry, thus avoiding the
latency increase.
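To make the timing rule concrete, here is a minimal userspace sketch of it.
This is only a model under stated assumptions, not the kernel code: struct
cpu_model, model_guest_exit() and model_needs_rcu_core() are hypothetical
names made up for illustration, and patience_ms merely stands in for
rcu.patience.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct cpu_model {
	uint64_t last_qs_ms;	/* time the last quiescent state was noted */
	uint64_t patience_ms;	/* stand-in for the rcu.patience setting */
};

/* Model of guest_exit: note a quiescent state at time now_ms. */
static void model_guest_exit(struct cpu_model *cpu, uint64_t now_ms)
{
	cpu->last_qs_ms = now_ms;
}

/*
 * Model of the check made between a guest_exit and the next guest_entry:
 * RCU core work is requested only once at least patience_ms have elapsed
 * since the last noted quiescent state.
 */
static bool model_needs_rcu_core(const struct cpu_model *cpu, uint64_t now_ms)
{
	return now_ms - cpu->last_qs_ms >= cpu->patience_ms;
}
```

With patience_ms = 1000 and a guest_exit modeled at t = 5000, a check at
t = 5400 returns false (the vcpu is expected to re-enter the guest soon),
while a check at t = 6200 returns true.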
What I tried to show above is that it also helps non-isolated cores,
since rcu_core will not be running as often, saving cpu cycles that
can be used by the VM.
What are your thoughts on that?
Thanks!
Leo