Message-ID: <alpine.DEB.2.20.1710032254020.2278@nanos>
Date:   Tue, 3 Oct 2017 23:02:55 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Jacob Pan <jacob.jun.pan@...ux.intel.com>
cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Yang Zhang <yang.zhang.wz@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        kvm@...r.kernel.org, Wanpeng Li <wanpeng.li@...mail.com>,
        Paolo Bonzini <pbonzini@...hat.com>, rkrcmar@...hat.com,
        dmatlack@...gle.com, agraf@...e.de,
        Peter Zijlstra <peterz@...radead.org>,
        Len Brown <lenb@...nel.org>,
        Linux PM <linux-pm@...r.kernel.org>
Subject: Re: [PATCH RFC hack dont apply] intel_idle: support running within
 a VM

On Mon, 2 Oct 2017, Jacob Pan wrote:
> On Sat, 30 Sep 2017 01:21:43 +0200
> "Rafael J. Wysocki" <rafael@...nel.org> wrote:
> 
> > On Sat, Sep 30, 2017 at 12:01 AM, Michael S. Tsirkin <mst@...hat.com>
> > wrote:
> > > intel idle driver does not DTRT when running within a VM:
> > > when going into a deep power state, the right thing to
> > > do is to exit to hypervisor rather than to keep polling
> > > within guest using mwait.
> > >
> > > Currently the solution is just to exit to hypervisor each time we go
> > > idle - this is why kvm does not expose the mwait leaf to guests even
> > > when it allows guests to do mwait.
> > >
> > > But that's not ideal - it seems better to use the idle driver to
> > > guess when will the next interrupt arrive.  
> > 
> > The idle driver alone is not sufficient for that, though.
> > 
> I second that. Why try to solve this problem at vendor specific driver
> level? perhaps just a pv idle driver that decide whether to vmexit
> based on something like local per vCPU timer expiration? I guess we
> can't predict other wake events such as interrupts.
> e.g.
> if (get_next_timer_interrupt() > kvm_halt_target_residency)

Bah. no. get_next_timer_interrupt() is not available for abuse in random
cpuidle driver code. It has state and it's tied to the nohz code.

There is the series from Aubrey which makes use of the various idle
prediction mechanisms (scheduler, irq timings, idle governor) to get an
idea about the estimated idle time. Exactly this information can be fed to
the kvmidle driver which can act accordingly.

Hacking a random hardware specific idle driver is definitely the wrong
approach. It might be useful to chain the kvmidle driver and hardware
specific drivers at some point, i.e. if the kvmidle driver decides not to
exit, it delegates the mwait decision to the proper hardware driver in
order not to reimplement all the required logic again. But that's a
different story.

See http://lkml.kernel.org/r/1506756034-6340-1-git-send-email-aubrey.li@intel.com

Thanks,

	tglx