[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANRm+CxXAt7z5H1v_Zpjg44Ka09eWc7gaJ7HRq9USUurjqrG3A@mail.gmail.com>
Date: Tue, 30 Mar 2021 09:33:51 +0800
From: Wanpeng Li <kernellwp@...il.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: LKML <linux-kernel@...r.kernel.org>, kvm <kvm@...r.kernel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] KVM: X86: Properly account for guest CPU time when
considering context tracking
On Tue, 30 Mar 2021 at 01:15, Sean Christopherson <seanjc@...gle.com> wrote:
>
> +Thomas
>
> On Mon, Mar 29, 2021, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpengli@...cent.com>
> >
> > The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831
> > reported that the guest time remains 0 when running a while true
> > loop in the guest.
> >
> > The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it
> > belongs") moves guest_exit_irqoff() close to vmexit breaks the
> > tick-based time accouting when the ticks that happen after IRQs are
> > disabled are incorrectly accounted to the host/system time. This is
> > because we exit the guest state too early.
> >
> > vtime-based time accounting is tied to context tracking, keep the
> > guest_exit_irqoff() around vmexit code when both vtime-based time
> > accounting and specific cpu is context tracking mode active.
> > Otherwise, leave guest_exit_irqoff() after handle_exit_irqoff()
> > and explicit IRQ window for tick-based time accouting.
> >
> > Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
> > Cc: Sean Christopherson <seanjc@...gle.com>
> > Signed-off-by: Wanpeng Li <wanpengli@...cent.com>
> > ---
> > arch/x86/kvm/svm/svm.c | 3 ++-
> > arch/x86/kvm/vmx/vmx.c | 3 ++-
> > arch/x86/kvm/x86.c | 2 ++
> > 3 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 58a45bb..55fb5ce 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -3812,7 +3812,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
> > * into world and some more.
> > */
> > lockdep_hardirqs_off(CALLER_ADDR0);
> > - guest_exit_irqoff();
> > + if (vtime_accounting_enabled_this_cpu())
> > + guest_exit_irqoff();
> >
> > instrumentation_begin();
> > trace_hardirqs_off_finish();
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 32cf828..85695b3 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6689,7 +6689,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
> > * into world and some more.
> > */
> > lockdep_hardirqs_off(CALLER_ADDR0);
> > - guest_exit_irqoff();
> > + if (vtime_accounting_enabled_this_cpu())
> > + guest_exit_irqoff();
>
> This looks ok, as CONFIG_CONTEXT_TRACKING and CONFIG_VIRT_CPU_ACCOUNTING_GEN are
> selected by CONFIG_NO_HZ_FULL=y, and can't be enabled independently, e.g. the
> rcu_user_exit() call won't be delayed because it will never be called in the
> !vtime case. But it still feels wrong poking into those details, e.g. it'll
> be weird and/or wrong guest_exit_irqoff() gains stuff that isn't vtime specific.
Could you elaborate what's the meaning of "it'll be weird and/or wrong
guest_exit_irqoff() gains stuff that isn't vtime specific."?
Wanpeng
Powered by blists - more mailing lists