Date:	Wed, 31 Jul 2013 11:53:06 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Jason Low <jason.low2@...com>
Cc:	Ingo Molnar <mingo@...hat.com>, KML <linux-kernel@...r.kernel.org>,
	Mike Galbraith <efault@....de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Paul Turner <pjt@...gle.com>, Alex Shi <alex.shi@...el.com>,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Morten Rasmussen <morten.rasmussen@....com>,
	Namhyung Kim <namhyung@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kees Cook <keescook@...omium.org>,
	Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
	aswin@...com, scott.norton@...com, chegu_vinod@...com,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Subject: Re: [RFC PATCH] sched: Reduce overestimating avg_idle

On Wed, Jul 31, 2013 at 02:37:52AM -0700, Jason Low wrote:
> The avg_idle value may sometimes be overestimated, which may cause newidle
> load balancing to be attempted more often than it should be. Currently, when
> avg_idle gets updated, if the delta exceeds some max value (default 1,000,000 ns),
> the entire avg gets set to the max value, regardless of what the previous avg
> was. So if a CPU remains idle for 200,000 ns most of the time, and then goes
> idle once for 1,200,000 ns, the average is pushed up to 1,000,000 ns when it
> should be less.
> 
> Additionally, once avg_idle is at its max, it may take a while to pull the
> avg back down to where it should be. In the above example, after avg_idle is
> set to the max value of 1,000,000 ns, the CPU's idle duration needs to be
> 200,000 ns for each of the next 8 occurrences before the avg falls below the
> migration cost value.
> 
> This patch attempts to avoid these situations by always updating the avg_idle
> value first with a call to update_avg(). Then, if avg_idle exceeds the max avg
> value, the avg gets set to the max. Also, this patch lowers the max avg_idle
> value to migration_cost * 1.5 instead of migration_cost * 2 to reduce the time
> it takes to pull avg_idle down to a lower value after long idles.
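
(For readers following along: below is a small standalone sketch of the numbers
in the description above. It assumes the usual 1/8 weighting of update_avg()
and the 500,000 ns default for sched_migration_cost; it is an illustration of
the described logic, not the patch itself.)

/*
 * Standalone illustration of the avg_idle behaviour described above.
 * Assumptions: update_avg() weighting of 1/8, sched_migration_cost
 * default of 500,000 ns.  Compile with: gcc -o avg_idle avg_idle.c
 */
#include <stdio.h>
#include <stdint.h>

#define MIGRATION_COST	500000ULL	/* assumed default, in ns */

static void update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = sample - *avg;
	*avg += diff >> 3;		/* exponential average, weight 1/8 */
}

int main(void)
{
	uint64_t avg = 200000;		/* CPU normally idles ~200,000 ns */
	uint64_t old_max = 2 * MIGRATION_COST;		/* 1,000,000 ns */
	uint64_t new_max = MIGRATION_COST * 3 / 2;	/*   750,000 ns */
	uint64_t delta = 1200000;	/* one long idle */
	int i;

	/* Current scheme: the long idle saturates the average outright. */
	if (delta > old_max)
		avg = old_max;
	else
		update_avg(&avg, delta);
	printf("current scheme after one long idle: avg_idle = %llu ns\n",
	       (unsigned long long)avg);

	/* ...and it takes 8 short idles to fall back below migration_cost. */
	for (i = 1; avg >= MIGRATION_COST; i++)
		update_avg(&avg, 200000);
	printf("short idles needed to drop below migration_cost: %d\n", i - 1);

	/* Proposed scheme: average first, then clip at the lower max. */
	avg = 200000;
	update_avg(&avg, delta);
	if (avg > new_max)
		avg = new_max;
	printf("proposed scheme after one long idle: avg_idle = %llu ns\n",
	       (unsigned long long)avg);

	return 0;
}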

Indeed, this seems quite sensible.

> With this change, I got some decent performance boosts in AIM7 workloads on an
> 8-socket machine on the 3.10 kernel. In particular, it boosted the AIM7 fserver
> workload by about 20% when running it with a high number of users.

Nice :-)

> An avg_idle-related question that I have is: does the migration_cost used in
> idle balance need to be the same as the migration_cost in task_hot()? Can we
> keep the default migration_cost used in task_hot() the same, but use a
> different or increased value only when comparing it with avg_idle in idle
> balance?

No, they're quite unrelated. I think you can measure the max time we've
ever spent in newidle balance and use that to clip the values.

Similarly, I've thought about how we update sd->avg_cost in the
previous patches and wondered whether we shouldn't track max_cost.

The 'only' downside I could come up with is that it's all run from
SoftIRQ context, which means IRQ/NMI/SMI can all stretch/warp the time it
takes to actually do the idle balance.

The idea behind using the max is that we want to reduce the chance that we
overrun the averages and consume time we should have spent doing useful
work.
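
(Purely illustrative sketch of the "clip by the worst observed newidle-balance
time" idea above; rq_sketch, max_newidle_cost and the helper names are made up
for this example and are not existing kernel code.)

#include <stdio.h>
#include <stdint.h>

struct rq_sketch {
	uint64_t avg_idle;
	uint64_t max_newidle_cost;	/* hypothetical: worst cost ever observed */
};

/*
 * Record the cost of a newidle balance pass.  Note that a single
 * IRQ/NMI/SMI-inflated sample would raise this permanently, which is the
 * downside mentioned above.
 */
static void record_newidle_cost(struct rq_sketch *rq, uint64_t cost)
{
	if (cost > rq->max_newidle_cost)
		rq->max_newidle_cost = cost;
}

/*
 * Only consider newidle balance worthwhile if the predicted idle time
 * covers the worst case we have ever seen, rather than a fixed
 * migration_cost threshold.
 */
static int worth_newidle_balance(const struct rq_sketch *rq)
{
	return rq->avg_idle > rq->max_newidle_cost;
}

int main(void)
{
	struct rq_sketch rq = { .avg_idle = 600000, .max_newidle_cost = 0 };

	record_newidle_cost(&rq, 250000);
	record_newidle_cost(&rq, 800000);	/* one slow pass raises the bar */

	/* 600,000 ns of predicted idle no longer covers the 800,000 ns
	 * worst case, so the balance attempt would be skipped. */
	printf("balance worthwhile? %s\n",
	       worth_newidle_balance(&rq) ? "yes" : "no");
	return 0;
}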