linux-kernel - Re: [RFC] sched: Limit idle_balance() when it is being used too frequently

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20130717093913.GP23818@dyad.programming.kicks-ass.net>
Date:	Wed, 17 Jul 2013 11:39:13 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Jason Low <jason.low2@...com>
Cc:	Ingo Molnar <mingo@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Mike Galbraith <efault@....de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Paul Turner <pjt@...gle.com>, Alex Shi <alex.shi@...el.com>,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Morten Rasmussen <morten.rasmussen@....com>,
	Namhyung Kim <namhyung@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kees Cook <keescook@...omium.org>,
	Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
	aswin@...com, scott.norton@...com, chegu_vinod@...com
Subject: Re: [RFC] sched: Limit idle_balance() when it is being used too
 frequently

On Wed, Jul 17, 2013 at 01:11:41AM -0700, Jason Low wrote:
> For the more complex model, are you suggesting that each completion time
> is the time it takes to complete 1 iteration of the for_each_domain()
> loop? 

Per sd, yes? So higher domains (or lower depending on how you model the thing
in you head) have bigger CPU spans, and thus take longer to complete. Imagine
the top domain of a 4096 cpu system, it would go look at all cpus to see if it
could find a task.

> Based on some of the data I collected, a single iteration of the
> for_each_domain() loop is almost always significantly lower than the
> approximate CPU idle time, even in workloads where idle_balance is
> lowering performance. The bigger issue is that it takes so many of these
> attempts before idle_balance actually "worked" and pulls a tasks.

I'm confused, so:

  schedule()
    if (!rq->nr_running)
      idle_balance()
        for_each_domain(sd)
          load_balance(sd)

is the entire thing, there's no other loop in there.

> I initially was thinking about each "completion time" of an idle balance
> as the sum total of the times of all iterations to complete until a task
> is successfully pulled within each domain.

So you're saying that normally idle_balance() won't find a task to pull? And we
need many times going newidle before we do get something?

Wouldn't this mean that there simply weren't enough tasks to keep all cpus busy?

If there were tasks we could've pulled, we might need to look at why they
weren't and maybe fix that. Now it could be that it things this cpu, even with
the (little) idle time it has is sufficiently loaded and we'll get a 'local'
wakeup soon enough. That's perfectly fine.

What we should avoid is spending more time looking for tasks then we have idle,
since that reduces the total time we can spend doing useful work. So that is I
think the critical cut-off point.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/