[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrWKbbY+34ZXYQB5e5k+hZfKFV4TqXy_cV3iY+58PajxUQ@mail.gmail.com>
Date: Tue, 10 Feb 2015 13:00:35 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Paul McKenney <paulmck@...ux.vnet.ibm.com>
Cc: Rik van Riel <riel@...hat.com>, Will Deacon <will.deacon@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Catalin Marinas <Catalin.Marinas@....com>,
Frédéric Weisbecker <fweisbec@...il.com>,
kvm list <kvm@...r.kernel.org>,
Marcelo Tosatti <mtosatti@...hat.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
Ingo Molnar <mingo@...nel.org>,
Oleg Nesterov <oleg@...hat.com>,
Luiz Capitulino <lcapitulino@...hat.com>,
Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when
running KVM guest
On Tue, Feb 10, 2015 at 12:42 PM, Paul E. McKenney
<paulmck@...ux.vnet.ibm.com> wrote:
> On Tue, Feb 10, 2015 at 12:19:28PM -0800, Andy Lutomirski wrote:
>> On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
>> <paulmck@...ux.vnet.ibm.com> wrote:
>> > On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
>> >> On 02/10/2015 06:41 AM, riel@...hat.com wrote:
>> >> >From: Rik van Riel <riel@...hat.com>
>> >> >
>> >> >The host kernel is not doing anything while the CPU is executing
>> >> >a KVM guest VCPU, so it can be marked as being in an extended
>> >> >quiescent state, identical to that used when running user space
>> >> >code.
>> >> >
>> >> >The only exception to that rule is when the host handles an
>> >> >interrupt, which is already handled by the irq code, which
>> >> >calls rcu_irq_enter and rcu_irq_exit.
>> >> >
>> >> >The guest_enter and guest_exit functions already switch vtime
>> >> >accounting independent of context tracking. Leave those calls
>> >> >where they are, instead of moving them into the context tracking
>> >> >code.
>> >> >
>> >> >Signed-off-by: Rik van Riel <riel@...hat.com>
>> >> >---
>> >> > include/linux/context_tracking.h | 6 ++++++
>> >> > include/linux/context_tracking_state.h | 1 +
>> >> > include/linux/kvm_host.h | 3 ++-
>> >> > 3 files changed, 9 insertions(+), 1 deletion(-)
>> >> >
>> >> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
>> >> >index 954253283709..b65fd1420e53 100644
>> >> >--- a/include/linux/context_tracking.h
>> >> >+++ b/include/linux/context_tracking.h
>> >> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
>> >> > vtime_guest_enter(current);
>> >> > else
>> >> > current->flags |= PF_VCPU;
>> >> >+
>> >> >+ if (context_tracking_is_enabled())
>> >> >+ context_tracking_enter(IN_GUEST);
>> >>
>> >> Why the if statement?
>> >>
>> >> Also, have you checked how much this hurts guest lightweight
>> >> entry/exit latency? Context tracking is shockingly expensive for
>> >> reasons I don't fully understand, but hopefully most of it is the
>> >> vtime stuff. (Context tracking is *so* expensive that I almost
>> >> think we should set the performance taint flag if we enable it,
>> >> assuming that flag ended up getting merged. Also, we should make
>> >> context tracking faster.)
>> >
>> > It turns out that context_tracking_is_enabled() is a static inline
>> > that uses a static_key, so the overhead should be minimal on platforms
>> > having a full implementation of static keys.
>>
>> Shouldn't we just fold that into context_tracking_xyz_enter?
>
> If I am not getting too confused, Rik did that initially, but it caused
> some pain for the ARM guys. I don't see a performance downside, at
> least not for a modern compiler that does a decent job of inlining.
It's more of a tidiness issue to me than a performance issue.
>
>> Also, why does the vtime stuff depend on RCU extended quiescent
>> states? To me, they seem mostly orthogonal other than the fact that
>> they hook into the same places.
>
> I might be missing your point, but...
>
> If there are no scheduling-clock interrupts, then the CPU needs to be
> in an extended quiescent state, otherwise you will get RCU CPU stall
> warnings and eventually OOM. Similarly, if there are no scheduling-clock
> interupts, then you need to compute the vtime stuff based on start times
> and deltas instead of relying on a scheduling-clock interrupt that never
> comes. So it isn't that the vtime and RCU stuff are directly related,
> but rather that they both must take evasive action if there are to be
> no scheduling-clock interrupts for an extended time period.
I'm probably missing something, but isn't vtime also used for accurate
CPU time stats?
--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists