linux-kernel - Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

Message-ID: <20170712083410.ualmvnvzoohyami5@hirez.programming.kicks-ass.net>
Date:   Wed, 12 Jul 2017 10:34:10 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     "Li, Aubrey" <aubrey.li@...ux.intel.com>
Cc:     Frederic Weisbecker <fweisbec@...il.com>,
        Christoph Lameter <cl@...ux.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Aubrey Li <aubrey.li@...el.com>, tglx@...utronix.de,
        len.brown@...el.com, rjw@...ysocki.net, tim.c.chen@...ux.intel.com,
        arjan@...ux.intel.com, paulmck@...ux.vnet.ibm.com,
        yang.zhang.wz@...il.com, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

On Wed, Jul 12, 2017 at 12:15:08PM +0800, Li, Aubrey wrote:
> Okay, the difference is that Mike's patch uses a very simple algorithm to make the decision.

No, the difference is that we don't end up with duplication of a metric
ton of code.

It uses the normal idle path; it just makes the NOHZ enter fail.

The condition Mike uses is why that patch never really went anywhere and
needs work.

For the condition I tend to prefer something auto-adjusting vs a tunable
threshold that everybody + dog needs to manually adjust.

So add something that measures the cost of tick_nohz_idle_{enter,exit}()
and base the threshold off of that. Then, of course, there's the question
of which of the idle estimates to use.
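
To make the auto-adjusting idea concrete, here is a rough userspace sketch,
not a patch. It keeps a running average of the measured cost of one
tick_nohz_idle_{enter,exit}() pair and only lets the tick be stopped when
the predicted idle time is comfortably larger than that cost. All names in
it (struct nohz_cost, nohz_cost_update(), nohz_worth_stopping_tick(),
NOHZ_COST_FACTOR) and the 1/8 averaging weight are made up for
illustration; nothing like this exists in the tree as written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: require the idle period to be this many times the cost. */
#define NOHZ_COST_FACTOR 2

/* Hypothetical per-CPU state: running average cost of one enter+exit pair. */
struct nohz_cost {
	uint64_t avg_ns;	/* 0 means "no sample yet" */
};

/*
 * Fold one measured enter+exit cost into the average.
 * The first sample seeds the average; later samples get weight 1/8.
 */
static void nohz_cost_update(struct nohz_cost *c, uint64_t sample_ns)
{
	assert(c);
	if (c->avg_ns == 0)
		c->avg_ns = sample_ns;
	else
		c->avg_ns = c->avg_ns - (c->avg_ns >> 3) + (sample_ns >> 3);
}

/*
 * Decide whether stopping the tick is worth it for this idle period.
 * predicted_idle_ns comes from whichever idle estimate ends up being used.
 */
static bool nohz_worth_stopping_tick(const struct nohz_cost *c,
				     uint64_t predicted_idle_ns)
{
	return predicted_idle_ns > NOHZ_COST_FACTOR * c->avg_ns;
}
```

The point of the sketch is that the threshold follows the machine: a box
where nohz enter/exit is expensive ends up with a higher bar without
anybody touching a knob.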

The cpuidle idle estimate includes IRQs, which is important for actual
idle states, but not all interrupts re-enable the tick.

The scheduler idle estimate only considers task activity, which tends to
re-enable the tick.

So the cpuidle estimate is pessimistic in that it'll vastly
underestimate the actual nohz period, while the scheduler estimate will
overestimate. I suspect the scheduler one is closer to the actual nohz
duration, but this is something we'll have to play with.
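
One way to play with it, sketched in userspace C under loud assumptions:
record, for each estimate, how far its prediction was from the nohz
duration that actually happened, and compare the accumulated error. The
names (struct est_stats, est_record(), est_closer()) are hypothetical and
only illustrate the bookkeeping; they are not existing kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical error bookkeeping for one idle estimator. */
struct est_stats {
	int64_t  bias_ns;	/* sum of (predicted - actual); sign shows over/under */
	uint64_t abs_err_ns;	/* sum of |predicted - actual| */
	uint64_t samples;
};

/* Record one prediction against the nohz duration that actually occurred. */
static void est_record(struct est_stats *s, uint64_t predicted_ns,
		       uint64_t actual_ns)
{
	int64_t err = (int64_t)predicted_ns - (int64_t)actual_ns;

	assert(s);
	s->bias_ns += err;
	s->abs_err_ns += (uint64_t)(err < 0 ? -err : err);
	s->samples++;
}

/*
 * Nonzero if estimator a has accumulated less absolute error than b.
 * Only meaningful when both were fed the same idle periods.
 */
static int est_closer(const struct est_stats *a, const struct est_stats *b)
{
	return a->abs_err_ns < b->abs_err_ns;
}
```

A negative bias_ns would match the "underestimates" case above and a
positive one the "overestimates" case; abs_err_ns says which estimate
lands closer overall.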