lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABCjUKB-pzcY-XFzpBQ6mRi-LiPJ7exAwr+RQXR-pD+P0cxrYA@mail.gmail.com>
Date: Wed, 8 Jan 2025 13:05:29 +0900
From: Suleiman Souhlal <suleiman@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner <tglx@...utronix.de>, 
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, 
	"H. Peter Anvin" <hpa@...or.com>, Chao Gao <chao.gao@...el.com>, 
	David Woodhouse <dwmw2@...radead.org>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	ssouhlal@...ebsd.org
Subject: Re: [PATCH v3 2/3] KVM: x86: Include host suspended time in steal time.

On Wed, Jan 8, 2025 at 12:37 AM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Tue, Jan 07, 2025, Suleiman Souhlal wrote:
> > When the host resumes from a suspend, the guest thinks any task
> > that was running during the suspend ran for a long time, even though
> > the effective run time was much shorter, which can end up having
> > negative effects with scheduling. This can be particularly noticeable
> > if the guest task was RT, as it can end up getting throttled for a
> > long time.
> >
> > To mitigate this issue, we include the time that the host was
>
> No "we".
>
> > suspended in steal time, which lets the guest can subtract the
> > duration from the tasks' runtime.
> >
> > Note that the case of a suspend happening during a VM migration
> > might not be accounted.
>
> And this isn't considered a bug because?  I asked for documentation, not a
> statement of fact.

I guess I don't really understand what the difference between
documentation and statements of fact is.

It's not completely clear to me what the desired behavior would be
when suspending during a VM migration.
If we wanted to inject the suspend duration that happened after the
migration started, but before it ended, I suppose we would need to add
a way for the new VM instance to add to steal time, possibly through a
new uAPI.

It is also not clear to me why we would want that.

>
> > Change-Id: I18d1d17d4d0d6f4c89b312e427036e052c47e1fa
>
> gerrit.
>
> > Signed-off-by: Suleiman Souhlal <suleiman@...gle.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h |  1 +
> >  arch/x86/kvm/x86.c              | 11 ++++++++++-
> >  2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index e159e44a6a1b61..01d44d527a7f88 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -897,6 +897,7 @@ struct kvm_vcpu_arch {
> >               u8 preempted;
> >               u64 msr_val;
> >               u64 last_steal;
> > +             u64 last_suspend_ns;
> >               struct gfn_to_hva_cache cache;
> >       } st;
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index c8160baf383851..12439edc36f83a 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -3650,7 +3650,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
> >       struct kvm_steal_time __user *st;
> >       struct kvm_memslots *slots;
> >       gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
> > -     u64 steal;
> > +     u64 steal, suspend_ns;
> >       u32 version;
> >
> >       if (kvm_xen_msr_enabled(vcpu->kvm)) {
> > @@ -3677,6 +3677,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
> >                       return;
> >       }
> >
> > +     suspend_ns = kvm_total_suspend_ns();
> >       st = (struct kvm_steal_time __user *)ghc->hva;
> >       /*
> >        * Doing a TLB flush here, on the guest's behalf, can avoid
> > @@ -3731,6 +3732,13 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
> >       steal += current->sched_info.run_delay -
> >               vcpu->arch.st.last_steal;
> >       vcpu->arch.st.last_steal = current->sched_info.run_delay;
> > +     /*
> > +      * Include the time that the host was suspended in steal time.
> > +      * Note that the case of a suspend happening during a VM migration
> > +      * might not be accounted.
> > +      */
>
> This is not a useful comment.  It's quite clear what that suspend time is being
> accumulated into steal_time, and restating the migration caveat does more harm
> than good, as that flaw is an issue with the overall design, i.e. has nothing to
> do with this specific snippet of code.

I will remove it.

Thanks,
-- Suleiman

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ