Message-ID: <CABCjUKAgH7ryZed=FP0GP84GTjeMRyQbjhP2pSsJ3Ksp63D7fA@mail.gmail.com>
Date: Thu, 18 Jul 2024 02:03:02 +0900
From: Suleiman Souhlal <suleiman@...gle.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Sean Christopherson <seanjc@...gle.com>, Joel Fernandes <joel@...lfernandes.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Vineeth Remanan Pillai <vineeth@...byteword.org>, Ben Segall <bsegall@...gle.com>,
Borislav Petkov <bp@...en8.de>, Daniel Bristot de Oliveira <bristot@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, Dietmar Eggemann <dietmar.eggemann@....com>,
"H . Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Mel Gorman <mgorman@...e.de>, Paolo Bonzini <pbonzini@...hat.com>, Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>,
Valentin Schneider <vschneid@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>,
Vitaly Kuznetsov <vkuznets@...hat.com>, Wanpeng Li <wanpengli@...cent.com>,
Masami Hiramatsu <mhiramat@...nel.org>, himadrics@...ia.fr, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, x86@...nel.org, graf@...zon.com,
drjunior.org@...il.com
Subject: Re: [RFC PATCH v2 0/5] Paravirt Scheduling (Dynamic vcpu priority management)
On Thu, Jul 18, 2024 at 12:20 AM Steven Rostedt <rostedt@...dmis.org> wrote:
>
> On Wed, 17 Jul 2024 10:52:33 -0400
> Steven Rostedt <rostedt@...dmis.org> wrote:
>
> > We could possibly add a new sched class that has a dynamic priority.
>
> It wouldn't need to be a new sched class. This could work with just a
> task_struct flag.
>
> It would only need to be checked in pick_next_task() and
> try_to_wake_up(). It would require that the shared memory has to be
> allocated by the host kernel and always present (unlike rseq). But with
> this coming from a virtio device driver, that shouldn't be a problem.
>
> If this flag is set on current, then the first thing that
> pick_next_task() should do is to see if it needs to change current's
> priority and policy (via a callback to the driver). And then it can
> decide what task to pick, as if current was boosted, it could very well
> be the next task again.
>
> In try_to_wake_up(), if the task waking up has this flag set, it could
> boost it via an option set by the virtio device. This would allow it to
> preempt the current process if necessary and get on the CPU. Then the
> guest would be required to lower its priority if the boost was not
> needed.
>
> Hmm, this could work.
For what it's worth, I proposed something somewhat conceptually similar before:
https://lore.kernel.org/kvm/CABCjUKBXCFO4-cXAUdbYEKMz4VyvZ5hD-1yP9H7S7eL8XsqO-g@mail.gmail.com/T/
Guest VCPUs would report their preempt_count to the host and the host
would use that to try not to preempt a VCPU that was in a critical
section (with some simple safeguards in case the guest was not well
behaved).
(It worked by adding a "may_preempt" notifier that would get called in
schedule(), whose return value would determine whether we'd try to
schedule away from current or not.)
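To make the shape of that concrete, here is a rough sketch in plain
userspace C, not actual kernel code and not what the linked patch does
line for line. Every name in it (pv_sched_shared, vcpu_task,
vcpu_may_preempt, max_deferrals) is made up for illustration, and the
"safeguard" shown is just one possible choice: a cap on how many times in
a row preemption can be deferred.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guest/host shared page; not a real KVM ABI. */
struct pv_sched_shared {
	uint32_t preempt_count;	/* written by the guest VCPU */
};

/* Hypothetical host-side state for the task backing one VCPU. */
struct vcpu_task {
	const struct pv_sched_shared *shared;
	unsigned int deferrals;		/* consecutive deferred preemptions */
	unsigned int max_deferrals;	/* assumed cap for misbehaving guests */
};

/*
 * Hypothetical "may_preempt" notifier, as it might be called from the
 * scheduling path. Returns true if the scheduler may switch away from
 * this VCPU now, false to ask it to hold off.
 *
 * A real implementation would need READ_ONCE()-style access to the
 * shared page, since the guest can change it at any time.
 */
static bool vcpu_may_preempt(struct vcpu_task *t)
{
	assert(t && t->shared);

	/* Guest is not in a critical section: preempt freely. */
	if (t->shared->preempt_count == 0) {
		t->deferrals = 0;
		return true;
	}

	/* Guest has claimed a critical section for too long: stop trusting it. */
	if (t->deferrals >= t->max_deferrals) {
		t->deferrals = 0;
		return true;
	}

	/* Guest is in a critical section: defer this preemption. */
	t->deferrals++;
	return false;
}
```

With max_deferrals set to 2, a VCPU that keeps reporting a non-zero
preempt_count gets two deferrals and is preempted on the third attempt,
so a guest that never drops its count cannot hold the CPU indefinitely.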
It was VM specific, but the same idea could be made to work for
generic userspace tasks.
-- Suleiman