Date:   Fri, 30 Jul 2021 18:09:12 +0900
From:   Suleiman Souhlal <suleiman@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Suleiman Souhlal <ssouhlal@...ebsd.org>,
        Joel Fernandes <joelaf@...gle.com>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Linux Kernel <linux-kernel@...r.kernel.org>,
        kvm@...r.kernel.org
Subject: Re: [RFC PATCH 0/2] KVM: Support Heterogeneous RT VCPU Configurations.

Hi Peter,

On Wed, Jul 28, 2021 at 5:11 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Wed, Jul 28, 2021 at 04:36:58PM +0900, Suleiman Souhlal wrote:
> > Hello,
> >
> > This series attempts to solve some issues that arise from
> > having some VCPUs be real-time while others aren't.
> >
> > We are trying to play media inside a VM on a desktop environment
> > (Chromebooks), which requires us to have some tasks in the guest
> > be serviced at real-time priority on the host so that the media
> > can be played smoothly.
> >
> > To achieve this, we give a VCPU real-time priority on the host
> > and use isolcpus= to ensure that only designated tasks are allowed
> > to run on the RT VCPU.
>
> WTH do you need isolcpus for that? What's wrong with cpusets?

I regret mentioning isolcpus here.
The patchset doesn't dictate how the guest is supposed to use RT.
cpusets also work.

> > In order to avoid priority inversions (for example when the RT
> > VCPU preempts a non-RT VCPU that's holding a lock that it wants to
> > acquire), we dedicate a host core to the RT VCPU: Only the RT
> > VCPU is allowed to run on that CPU, while all the other non-RT
> > VCPUs run on all the other host CPUs.
> >
> > This approach works on machines that have a large enough number
> > of CPUs where it's possible to dedicate a whole CPU for this,
> > but we also have machines that only have 2 CPUs and doing this
> > on those is too costly.
> >
> > This patch series makes it possible to have an RT VCPU without
> > having to dedicate a whole host core for it.
> > It does this by making it so that non-RT VCPUs can't be
> > preempted if they are in a critical section, which we
> > approximate as having interrupts disabled or non-zero
> > preempt_count. Once the VCPU is found to not be in a critical
> > section anymore, it will give up the CPU.
> > There are measures to ensure that preemption isn't delayed too
> > many times.
> >
> > (I realize that the hooks in the scheduler aren't very
> > tasteful, but I couldn't figure out a better way.
> > SVM support will be added when sending the patch for
> > inclusion.)
> >
> > Feedback or alternatives are appreciated.
>
> This is disgusting and completely wrecks the host scheduling. You're
> placing guest over host, that's fundamentally wrong.

I understand the sentiment.

For what it's worth, the patchset doesn't completely rely on a
well-behaved guest: It only delays preemption a bounded number of
times, after which it yields back no matter what.
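
To illustrate the bounded-delay idea, here is a minimal userspace sketch in C.
It is not code from the patchset. The struct, the function names, and the cap
of 3 (MAX_PREEMPT_DELAYS) are all hypothetical, chosen only to show the shape
of the logic: defer preemption while the non-RT VCPU looks like it is in a
critical section (interrupts disabled or non-zero preempt_count), but only up
to a fixed number of times.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cap; the actual bound is not stated in this thread. */
#define MAX_PREEMPT_DELAYS 3

/* Hypothetical snapshot of the guest-visible state of a non-RT VCPU. */
struct vcpu_state {
	bool irqs_disabled;   /* guest has interrupts disabled */
	int preempt_count;    /* guest preempt_count; non-zero means non-preemptible */
	int delays_used;      /* how many times preemption was already deferred */
};

/* Approximation of "in a critical section" as described in the cover letter. */
static bool in_critical_section(const struct vcpu_state *v)
{
	return v->irqs_disabled || v->preempt_count != 0;
}

/*
 * Decide whether the non-RT VCPU must give up the CPU now.
 * Returns false (defer) only while the VCPU is in a critical section
 * and the deferral budget is not exhausted; otherwise returns true
 * and resets the budget for the next preemption request.
 */
static bool should_yield(struct vcpu_state *v)
{
	assert(v != NULL);

	if (in_critical_section(v) && v->delays_used < MAX_PREEMPT_DELAYS) {
		v->delays_used++;
		return false;
	}

	v->delays_used = 0;
	return true;
}
```

With these assumed values, a VCPU that never leaves its critical section is
deferred three times and forced off the CPU on the fourth request, while a
VCPU outside a critical section yields immediately.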

> NAK!
>
> If you want co-ordinated RT scheduling, look at paravirtualized deadline
> scheduling.

Thanks for the suggestion, I will look into it.

-- Suleiman
