Message-ID: <Z8B5MMCzBGwkTT0X@google.com>
Date: Thu, 27 Feb 2025 06:39:44 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Fernand Sieber <sieberf@...zon.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>,
"peterz@...radead.org" <peterz@...radead.org>, "mingo@...hat.com" <mingo@...hat.com>,
"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"nh-open-source@...zon.com" <nh-open-source@...zon.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: Re: [RFC PATCH 0/3] kvm,sched: Add gtime halted
On Thu, Feb 27, 2025, Fernand Sieber wrote:
> On Wed, 2025-02-26 at 13:00 -0800, Sean Christopherson wrote:
> > On Wed, Feb 26, 2025, Fernand Sieber wrote:
> > > On Tue, 2025-02-25 at 18:17 -0800, Sean Christopherson wrote:
> > > > And if you're running vCPUs on tickless CPUs, and you're doing
> > > > HLT/MWAIT passthrough, *and* you want to schedule other tasks on those
> > > > CPUs, then IMO you're abusing all of those things and it's not KVM's
> > > > problem to solve, especially now that sched_ext is a thing.
> > >
> > > We are running vCPUs with ticks; the rest of your observations are
> > > correct.
> >
> > If there's a host tick, why do you need KVM's help to make scheduling
> > decisions? It sounds like what you want is a scheduler that is primarily
> > driven by MPERF (and APERF?), and sched_tick() => arch_scale_freq_tick()
> > already knows about MPERF.
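
To expand on that, the tick path already snapshots both MSRs and computes
per-tick deltas; roughly like so (simplified sketch from memory, with
invented per-CPU variable names, see arch/x86/kernel/cpu/aperfmperf.c for
the real code):

	static DEFINE_PER_CPU(u64, prev_aperf);
	static DEFINE_PER_CPU(u64, prev_mperf);

	void arch_scale_freq_tick(void)
	{
		u64 aperf, mperf, acnt, mcnt;

		if (!cpu_feature_enabled(X86_FEATURE_APERFMPERF))
			return;

		rdmsrl(MSR_IA32_APERF, aperf);
		rdmsrl(MSR_IA32_MPERF, mperf);

		acnt = aperf - this_cpu_read(prev_aperf);
		mcnt = mperf - this_cpu_read(prev_mperf);

		this_cpu_write(prev_aperf, aperf);
		this_cpu_write(prev_mperf, mperf);

		/*
		 * MPERF counts at a constant rate, but only while the CPU
		 * is unhalted (C0), so mcnt is effectively the unhalted
		 * cycles for this tick period, sampled for free every tick.
		 */
		scale_freq_tick(acnt, mcnt);
	}
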
>
> Having the measurement around VM enter/exit makes it easy to attribute
> the unhalted cycles to a specific task (vCPU), which solves both of our
> use cases, VM metrics and scheduling. That said, we may be able to
> avoid it and achieve the same results.
>
> i.e.
> * the VM metrics use case can be solved by using /proc/cpuinfo from
> userspace.
> * for the scheduling use case, the tick-based sampling of MPERF means
> we could potentially introduce a correcting factor on the PELT
> accounting of pinned vCPU tasks based on the sampled value (similar
> to what I do in the last patch of the series).
>
> The combination of these would remove the need to add any logic around
> VM enter/exit to support our use cases.
>
> I'm happy to prototype that if we think it's going in the right
> direction?
That's mostly a question for the scheduler folks. That said, from a KVM perspective,
sampling MPERF around entry/exit for scheduling purposes is a non-starter.
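
FWIW, if I'm reading the tick-based idea correctly, the PELT correction
would amount to scaling the time delta that PELT accumulates for a pinned
vCPU task by the unhalted fraction of the window, e.g. the per-tick MPERF
delta over the TSC delta for the same window. Purely hypothetical sketch,
none of these helpers exist as-is:

	/*
	 * Scale a PELT time delta by the fraction of the window the CPU
	 * was actually unhalted: mperf_delta / tsc_delta, where both
	 * deltas cover the same sampling window (hypothetical helper).
	 */
	static u64 scale_delta_unhalted(u64 delta, u64 mperf_delta,
					u64 tsc_delta)
	{
		if (!tsc_delta)
			return delta;

		return mul_u64_u64_div_u64(delta, mperf_delta, tsc_delta);
	}

But again, whether something like that flies is up to the scheduler folks.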