Date:	Wed, 29 Jun 2011 17:07:15 -0700
From:	Nikhil Rao <ncrao@...gle.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	"Alex, Shi" <alex.shi@...el.com>, "mingo@...e.hu" <mingo@...e.hu>,
	"Chen, Tim C" <tim.c.chen@...el.com>,
	"Li, Shaohua" <shaohua.li@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	len.brown@...el.com
Subject: Re: power increase issue on light load

On Tue, Jun 28, 2011 at 7:30 PM, Nikhil Rao <ncrao@...gle.com> wrote:
> Looking at the schedstat data Alex posted:
> - Distribution of load balances across cores looks about the same.
> - Load balancer does more idle balances on 3.0-rc4 as compared to
> 2.6.39 on SMT and NUMA domains. Busy and newidle balances are a mixed
> bag.
> - I see far fewer affine wakeups on 3.0-rc4 as compared to 2.6.39.
> About half as many affine wakeups on SMT and about a quarter as many
> on NUMA.
>
> I'm investigating the impact of the load resolution patchset on
> effective load and wake affine calculations. This seems to be the most
> obvious difference from the schedstat data.
>

I went through the math in effective_load() and wake_affine() and I
think it should be OK. There are a couple of corner cases where
increasing sched load resolution can change the result of wake affine
-- I've listed them below. However, I am not convinced you are hitting
these cases often enough to make a noticeable difference. I'm looking
into the other LB paths...

- One corner case comes from rounding error in the shares update
path. Say the shares update logic assigns weight A to a sched entity
with scaled resolution, and weight B without scaling. We expect
A/1024 = B, but this is not always the case because of rounding
error. The difference between A and B*1024 gets amplified in
wake_affine(), since it multiplies (weight + effective load) by
imbalance pct and cpu power -- we effectively scale this up by 5
orders of magnitude. When prev_eff_load and this_eff_load are close,
this difference can change the outcome of wake_affine().

- There's a corner case in effective_load(), where if a task wakes up
on an empty cfs_rq, you could hit the clamp in effective_load (i.e. <
MIN_SHARES) which can affect prev_eff_load (you get a lower number --
making it less likely to do an affine wakeup). I think this patch
(against 3.0-rc4) will address that issue -- can you please give this
a try?

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 433491c..6fcfbfc 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1442,8 +1442,8 @@ static long effective_load(struct task_group *tg, int cpu, long wl, long wg)
                        wl = tg->shares;

                /* zero point is MIN_SHARES */
-               if (wl < MIN_SHARES)
-                       wl = MIN_SHARES;
+               if (wl < scale_load(MIN_SHARES))
+                       wl = scale_load(MIN_SHARES);
                wl -= se->load.weight;
                wg = 0;
        }
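
The two corner cases above can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not the kernel code: the
helpers calc_shares(), shares_residual(), amplify() and clamp_delta() are
hypothetical names, the macros are local stand-ins for the kernel's
scale_load() and MIN_SHARES, the shares formula is reduced to a single
truncating division, and the factors 100 and 1024 are assumed stand-ins
for imbalance pct and cpu power.

```c
#include <assert.h>

/* Local stand-ins for the kernel macros; values assumed for illustration. */
#define SCHED_LOAD_RESOLUTION 10
#define scale_load(w) ((long long)(w) << SCHED_LOAD_RESOLUTION)
#define MIN_SHARES 2

/* Simplified shares update: one truncating integer division. */
static long long calc_shares(long long tg_shares, long long load,
                             long long tg_weight)
{
    assert(tg_weight > 0);
    return tg_shares * load / tg_weight;
}

/*
 * Corner case 1: A is the weight computed from scaled inputs, B the
 * weight computed from unscaled inputs. Returns A - B*1024, the
 * low-order residue that truncation discards in the unscaled case.
 */
static long long shares_residual(long long tg_shares, long long load,
                                 long long tg_weight)
{
    long long b = calc_shares(tg_shares, load, tg_weight);
    long long a = calc_shares(scale_load(tg_shares), scale_load(load),
                              scale_load(tg_weight));
    return a - scale_load(b);
}

/* Multiply a residual by a percentage factor and a cpu power factor. */
static long long amplify(long long residual, long long pct,
                         long long cpu_power)
{
    return residual * pct * cpu_power;
}

/*
 * Corner case 2: clamp wl to a zero point, then subtract the entity's
 * current weight, mirroring the lines touched by the patch.
 */
static long long clamp_delta(long long wl, long long zero_point,
                             long long se_weight)
{
    if (wl < zero_point)
        wl = zero_point;
    return wl - se_weight;
}
```

With three equally weighted entities (shares 1024, load 1024, total weight
3072), B is 341 and A is 349525, so shares_residual() returns 341; with the
assumed factors amplify(341, 100, 1024) gives 34918400. For the clamp,
assume a scaled entity weight of scale_load(MIN_SHARES) and a computed wl of
1000: clamp_delta(1000, MIN_SHARES, scale_load(MIN_SHARES)) returns -1048
with the unscaled zero point, while clamp_delta(1000,
scale_load(MIN_SHARES), scale_load(MIN_SHARES)) returns 0 with the scaled
one.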
