Message-ID: <20070521085703.GA18755@elte.hu>
Date: Mon, 21 May 2007 10:57:03 +0200
From: Ingo Molnar <mingo@...e.hu>
To: William Lee Irwin III <wli@...omorphy.com>
Cc: Dmitry Adamushko <dmitry.adamushko@...il.com>,
Peter Williams <pwil3058@...pond.net.au>,
Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [patch] CFS scheduler, -v12
* William Lee Irwin III <wli@...omorphy.com> wrote:
> cfs should probably consider aggregate lag as opposed to aggregate
> weighted load. Mainline's convergence to proper CPU bandwidth
> distributions on SMP (e.g. N+1 tasks of equal nice on N cpus) is
> incredibly slow and probably also fragile in the presence of arrivals
> and departures partly because of this. [...]
hm, have you actually tested CFS before coming to this conclusion?
CFS is fair even on SMP. Consider for example the worst-case
3-tasks-on-2-CPUs workload on a 2-CPU box:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2658 mingo     20   0  1580  248  200 R   67  0.0   0:56.30 loop
 2656 mingo     20   0  1580  252  200 R   66  0.0   0:55.55 loop
 2657 mingo     20   0  1576  248  200 R   66  0.0   0:55.24 loop
66% of CPU time for each task. The 'TIME+' column shows a 2% spread
between the slowest and the fastest loop after just 1 minute of runtime
(and the spread gets narrower with time). Mainline does a 50% / 50% /
100% split:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3121 mingo     25   0  1584  252  204 R  100  0.0   0:13.11 loop
 3120 mingo     25   0  1584  256  204 R   50  0.0   0:06.68 loop
 3119 mingo     25   0  1584  252  204 R   50  0.0   0:06.64 loop
and i fixed that in CFS.
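[Editor's sketch, not part of the original mail.] The arithmetic behind the two listings can be checked with a few lines of Python. The helper names (fair_share_percent, parse_time_plus, spread_percent) are made up for illustration, and "spread" is assumed here to mean (max - min) / min over the TIME+ column, which is one plausible reading of the 2% figure:

```python
def fair_share_percent(n_tasks, n_cpus):
    """Ideal %CPU per task for equal-weight, CPU-bound tasks."""
    # A single task cannot use more than one CPU, hence the cap at 100%.
    return 100.0 * min(1.0, n_cpus / n_tasks)


def parse_time_plus(text):
    """Convert a top(1) TIME+ value of the form 'M:SS.hh' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def spread_percent(values):
    """Gap between the largest and smallest value, relative to the smallest."""
    return 100.0 * (max(values) - min(values)) / min(values)


# TIME+ columns copied from the two listings above.
cfs_times = [parse_time_plus(t) for t in ("0:56.30", "0:55.55", "0:55.24")]
mainline_times = [parse_time_plus(t) for t in ("0:13.11", "0:06.68", "0:06.64")]

print(f"ideal share, 3 tasks on 2 CPUs: {fair_share_percent(3, 2):.1f}%")
print(f"CFS TIME+ spread:      {spread_percent(cfs_times):.1f}%")
print(f"mainline TIME+ spread: {spread_percent(mainline_times):.1f}%")
```

Under that definition the CFS listing comes out at roughly 2%, while the mainline listing has one task with about twice the CPU time of the other two.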
or consider a sleepy workload like massive_intr, 3-tasks-on-2-CPUs:
europe:~> head -1 /proc/interrupts
CPU0 CPU1
europe:~> ./massive_intr 3 10
002623 00000722
002621 00000720
002622 00000721
Or a 5-tasks-on-2-CPUs workload:
europe:~> ./massive_intr 5 50
002649 00002519
002653 00002492
002651 00002478
002652 00002510
002650 00002478
that's around 1% of spread.
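[Editor's sketch, not part of the original mail.] One way to quantify the massive_intr results is the largest deviation of any task from the mean. This assumes each output line is "PID count", with the second column being a per-process work counter; parse_counts and max_deviation_percent are hypothetical helper names:

```python
def parse_counts(output):
    """Map PID -> counter from lines of the form 'PID COUNT'."""
    counts = {}
    for line in output.strip().splitlines():
        pid, count = line.split()
        counts[int(pid)] = int(count)
    return counts


def max_deviation_percent(counts):
    """Largest absolute deviation from the mean, as a percentage of the mean."""
    values = list(counts.values())
    mean = sum(values) / len(values)
    return 100.0 * max(abs(v - mean) for v in values) / mean


# Output copied from the two massive_intr runs above.
three_tasks = parse_counts("""
002623 00000722
002621 00000720
002622 00000721
""")
five_tasks = parse_counts("""
002649 00002519
002653 00002492
002651 00002478
002652 00002510
002650 00002478
""")

print(f"3 tasks: max deviation from mean {max_deviation_percent(three_tasks):.2f}%")
print(f"5 tasks: max deviation from mean {max_deviation_percent(five_tasks):.2f}%")
```

Measured this way, no task in the 5-task run is more than about 1% away from the mean. Measured as (max - min) / min, the same run gives a somewhat larger figure, so the exact number depends on which definition of spread is used.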
load-balancing is a performance vs. fairness tradeoff, so we won't be
able to make it precisely fair, because that's hideously expensive on
SMP (barring someone showing a working patch, of course) - but in CFS i
got quite close to having it very fair in practice.
> [...] Tong Li's DWRR repairs the deficit in mainline by synchronizing
> epochs or otherwise bounding epoch dispersion. This doesn't directly
> translate to cfs. In cfs cpu should probably try to figure out if its
> aggregate lag (e.g. via minimax) is above or below average, and push
> to or pull from the other half accordingly.
i'd first like to see a demonstration of a problem to solve, before
thinking about more complex solutions ;-)
Ingo