Message-ID: <CAJ6HWG53vjhAKjPAFeyjdbopAWzSJTBDz5t5YY+2B13MUdPYfQ@mail.gmail.com>
Date: Tue, 27 Aug 2024 16:50:36 -0300
From: Leonardo Bras Soares Passos <leobras@...hat.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>, Leonardo Bras <leobras.c@...il.com>,
Sean Christopherson <seanjc@...gle.com>, Frederic Weisbecker <frederic@...nel.org>,
Marcelo Tosatti <mtosatti@...hat.com>, linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [RFC PATCH 1/1] kvm: Note an RCU quiescent state on guest exit
Hi Sean,
Have you had the time to review this?
The QE team is hitting this bug a lot, and I am afraid it will soon
start hitting customers as well.
Please let me know if you need any further data / assistance.
Thanks!
Leo
On Mon, Jul 29, 2024 at 8:28 AM Leonardo Bras Soares Passos
<leobras@...hat.com> wrote:
>
> On Fri, Jul 12, 2024 at 5:02 PM Leonardo Bras <leobras@...hat.com> wrote:
> >
> > On Fri, Jul 12, 2024 at 05:57:10PM +0200, Paolo Bonzini wrote:
> > > On 7/11/24 01:18, Leonardo Bras wrote:
> > > > What are your thoughts on above results?
> > > > Anything you would suggest changing?
> > >
> >
> > Hello Paolo, thanks for the feedback!
> >
> > > Can you run the test with a conditional on "!tick_nohz_full_cpu(vcpu->cpu)"?
> > >
> > > If your hunch is correct that nohz-full CPUs already avoid invoke_rcu_core()
> > > you might get the best of both worlds.
> > >
> > > tick_nohz_full_cpu() is very fast when there is no nohz-full CPU, because
> > > then it shortcuts on context_tracking_enabled() (which is just a static
> > > key).
> >
> > But that would mean not noting an RCU quiescent state in guest_exit of
> > nohz_full cpus, right?
> >
> > The original issue we were dealing with was invoke_rcu_core() running on
> > nohz_full cpus and disturbing the latency of RT workloads inside the VM.
> >
> > While most of the invoke_rcu_core() calls get ignored by the nohz_full
> > rule, there are scenarios in which the vcpu thread may take more than 1s
> > between one guest_entry and the next (VM busy), and the calls that did
> > not get ignored have caused latency peaks in our tests.
> >
> > The main idea of this patch is to note an RCU quiescent state on
> > guest_exit on nohz_full cpus (and use rcu.patience) so that
> > invoke_rcu_core() does not run between a guest_exit and the next
> > guest_entry whenever less than rcu.patience milliseconds elapse between
> > exit and entry, thus avoiding the latency increase.
> >
> > What I tried to show above is that it also improves non-isolated cores,
> > since rcu_core will not run as often, saving cpu cycles that can be
> > used by the VM.
> >
> >
> > What are your thoughts on that?
>
> Hello Paolo, Sean,
> Thanks for the feedback so far!
>
> Do you have any thoughts or suggestions for this patch?
>
> Thanks!
> Leo
>
> >
> > Thanks!
> > Leo