Message-ID: <20150210204241.GV4166@linux.vnet.ibm.com>
Date: Tue, 10 Feb 2015 12:42:41 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Andy Lutomirski <luto@...capital.net>
Cc: Rik van Riel <riel@...hat.com>, Will Deacon <will.deacon@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Catalin Marinas <Catalin.Marinas@....com>,
Frédéric Weisbecker <fweisbec@...il.com>,
kvm list <kvm@...r.kernel.org>,
Marcelo Tosatti <mtosatti@...hat.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
Ingo Molnar <mingo@...nel.org>,
Oleg Nesterov <oleg@...hat.com>,
Luiz Capitulino <lcapitulino@...hat.com>,
Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when
running KVM guest

On Tue, Feb 10, 2015 at 12:19:28PM -0800, Andy Lutomirski wrote:
> On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
> <paulmck@...ux.vnet.ibm.com> wrote:
> > On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
> >> On 02/10/2015 06:41 AM, riel@...hat.com wrote:
> >> >From: Rik van Riel <riel@...hat.com>
> >> >
> >> >The host kernel is not doing anything while the CPU is executing
> >> >a KVM guest VCPU, so it can be marked as being in an extended
> >> >quiescent state, identical to that used when running user space
> >> >code.
> >> >
> >> >The only exception to that rule is when the host handles an
> >> >interrupt, which is already handled by the irq code, which
> >> >calls rcu_irq_enter and rcu_irq_exit.
> >> >
> >> >The guest_enter and guest_exit functions already switch vtime
> >> >accounting independent of context tracking. Leave those calls
> >> >where they are, instead of moving them into the context tracking
> >> >code.
> >> >
> >> >Signed-off-by: Rik van Riel <riel@...hat.com>
> >> >---
> >> > include/linux/context_tracking.h | 6 ++++++
> >> > include/linux/context_tracking_state.h | 1 +
> >> > include/linux/kvm_host.h | 3 ++-
> >> > 3 files changed, 9 insertions(+), 1 deletion(-)
> >> >
> >> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >> >index 954253283709..b65fd1420e53 100644
> >> >--- a/include/linux/context_tracking.h
> >> >+++ b/include/linux/context_tracking.h
> >> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
> >> > vtime_guest_enter(current);
> >> > else
> >> > current->flags |= PF_VCPU;
> >> >+
> >> >+ if (context_tracking_is_enabled())
> >> >+ context_tracking_enter(IN_GUEST);
> >>
> >> Why the if statement?
> >>
> >> Also, have you checked how much this hurts guest lightweight
> >> entry/exit latency? Context tracking is shockingly expensive for
> >> reasons I don't fully understand, but hopefully most of it is the
> >> vtime stuff. (Context tracking is *so* expensive that I almost
> >> think we should set the performance taint flag if we enable it,
> >> assuming that flag ended up getting merged. Also, we should make
> >> context tracking faster.)
> >
> > It turns out that context_tracking_is_enabled() is a static inline
> > that uses a static_key, so the overhead should be minimal on platforms
> > having a full implementation of static keys.
>
> Shouldn't we just fold that into context_tracking_xyz_enter?

If I am not getting too confused, Rik did that initially, but it caused
some pain for the ARM guys.  I don't see a performance downside, at
least not for a modern compiler that does a decent job of inlining.

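
To make the folding question concrete, here is a rough userspace sketch of
what "fold the check into the enter function" could look like.  Everything
in it is a hypothetical stand-in: ct_enabled is a plain bool playing the
role of the static key, and ct_enter()/ct_enter_slow() are made-up names,
not the functions in Rik's patch.  In the kernel the test would be a
patched branch rather than a memory load.

```c
/* Userspace sketch; all names are hypothetical stand-ins, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the static key; in the kernel this would be a patched branch. */
static bool ct_enabled;

/* Counts how often the expensive out-of-line path actually runs. */
static int ct_slow_path_calls;

/* Out-of-line slow path, reached only when tracking is enabled. */
static void ct_enter_slow(int state)
{
	assert(state >= 0);	/* sanity check on the hypothetical state value */
	ct_slow_path_calls++;
}

/*
 * Folded variant: the enabled check lives in the inline wrapper, so the
 * caller needs no if statement of its own.  With decent inlining this
 * compiles to the same code as an open-coded check at the call site.
 */
static inline void ct_enter(int state)
{
	if (ct_enabled)
		ct_enter_slow(state);
}
```

With ct_enabled false, ct_enter() is a test and a fall-through, which is
the cheap case being discussed.  Whether the check sits in the caller or in
the wrapper is then a question of code organization, not of cost.
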
> Also, why does the vtime stuff depend on RCU extended quiescent
> states? To me, they seem mostly orthogonal other than the fact that
> they hook into the same places.

I might be missing your point, but...

If there are no scheduling-clock interrupts, then the CPU needs to be
in an extended quiescent state, otherwise you will get RCU CPU stall
warnings and eventually OOM.  Similarly, if there are no scheduling-clock
interrupts, then you need to compute the vtime stuff based on start times
and deltas instead of relying on a scheduling-clock interrupt that never
comes.  So it isn't that the vtime and RCU stuff are directly related,
but rather that they both must take evasive action if there are to be
no scheduling-clock interrupts for an extended time period.

Therefore, they really need to key off of the same conditions.
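
A toy userspace model of the point above, in case it helps.  This is only
a sketch under simplifying assumptions: struct cpu_sketch, its fields, and
the sketch_guest_enter()/sketch_guest_exit() helpers are all invented for
illustration, timestamps are plain integers, and a single tick_stopped flag
stands in for "no scheduling-clock interrupts are coming".  It is not how
the kernel's vtime or RCU code is structured.

```c
/* Userspace sketch; every name here is hypothetical, not a kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cpu_sketch {
	bool tick_stopped;	/* no scheduling-clock interrupts expected */
	bool in_eqs;		/* models an RCU extended quiescent state */
	uint64_t guest_start;	/* timestamp taken at guest entry */
	uint64_t guest_time;	/* guest time accumulated from deltas */
};

static void sketch_guest_enter(struct cpu_sketch *c, uint64_t now)
{
	if (!c->tick_stopped)
		return;		/* tick is running, so it covers both jobs */
	c->in_eqs = true;	/* RCU: stop waiting on this CPU */
	c->guest_start = now;	/* vtime: remember the start time */
}

static void sketch_guest_exit(struct cpu_sketch *c, uint64_t now)
{
	if (!c->tick_stopped)
		return;
	assert(now >= c->guest_start);
	c->guest_time += now - c->guest_start;	/* vtime: account the delta */
	c->in_eqs = false;			/* RCU: CPU is being watched again */
}
```

Both the quiescent-state marking and the delta accounting hang off the
same tick_stopped test, which is the sense in which the two need to key
off of the same conditions even though they are otherwise unrelated.
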

							Thanx, Paul