Message-ID: <20070611193735.GA22152@elte.hu>
Date: Mon, 11 Jun 2007 21:37:35 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@...oo.com.au>, efault@....de,
kernel@...ivas.org, containers@...ts.osdl.org,
ckrm-tech@...ts.sourceforge.net, torvalds@...ux-foundation.org,
akpm@...ux-foundation.org, pwil3058@...pond.net.au,
tingy@...umass.edu, tong.n.li@...el.com, wli@...omorphy.com,
linux-kernel@...r.kernel.org, dmitry.adamushko@...il.com,
balbir@...ibm.com, Kirill Korotaev <dev@...ru>
Subject: Re: [RFC][PATCH 0/6] Add group fairness to CFS - v1
* Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com> wrote:
> Ingo,
> Here's an update of the group fairness patch I have been
> working on. It's against CFS v16 (sched-cfs-v2.6.22-rc4-mm2-v16.patch).
thanks!
> The core idea is to reuse much of the CFS logic to apply fairness at
> higher hierarchical levels (user, container etc). To this end, the CFS
> engine has been modified to deal with generic 'schedulable entities'.
> The patches introduce two essential structures in CFS core:
>
> - struct sched_entity
> - represents a schedulable entity in a hierarchy. Task
> is the lowest element in this hierarchy. Its ancestors
> could be user, container etc. This structure stores
> essential attributes/execution-history (wait_runtime etc)
> which are required by the CFS engine to provide fairness
> between 'struct sched_entities' at the same hierarchy level.
>
> - struct lrq
> - represents (per-cpu) runqueue in which ready-to-run
> 'struct sched_entities' are queued. The fair clock
> calculation is split to be per 'struct lrq'.
>
> Here's a brief description of the patches to follow:
>
> Patches 1-3 introduce the essential changes in CFS core to support
> this concept. They rework existing code w/o any (intended!) change in
> functionality.
i currently have these 3 patches applied to the CFS queue and it's
looking pretty good so far! If it continues to be problem-free i'll
release them as part of -v17, just to check that they truly have no bad
side-effects (they shouldn't). Then #4 can go into -v18.
i've attached my current -v17 tree - it should apply mostly cleanly
on top of the -mm queue (with a minor number of fixups). Could you
refactor the remaining 3 patches on top of this base? There are some
rejects in the last 3 patches due to the update_load_fair() change.
> Patch 4 fixes some bad interaction between SCHED_RT and SCHED_NORMAL
> tasks in current CFS.
btw., the plan here is to turn off 'bit 0' in sched_features: i.e. to
use the precise statistics to calculate lrq->cpu_load[], not the
timer-irq-sampled imprecise statistics. Dmitry has fixed a couple of
bugs in it that made it not work too well in previous CFS versions, but
now we are ready to turn it on for -v17. (indeed in my tree it's already
turned on - i.e. sched_features defaults to '14')
> Patch 5 introduces basic changes in CFS core to support group
> fairness.
>
> Patch 6 hooks up scheduler with container patches in mm (as an
> interface for task-grouping functionality).
ok. Kirill, how do you like Srivatsa's current approach? Would be nice
to kill two birds with the same stone, if possible :-)
> Note: I have noticed that running lat_ctx in a loop 10 times
> doesn't give me good results. Basically I expected the loop to take
> the same time for both users (when run simultaneously), whereas it was
> taking different times for different users. I think this can be solved
> by increasing sysctl_sched_runtime_limit at group level (to remember
> execution history over a longer period).
you'll get the best hackbench results by using SCHED_BATCH:
chrt -b 0 ./hackbench 10
or indeed increasing the runtime_limit would work too.
Ingo
[ Attachment: sched-cfs-v17-rc4.patch (text/plain, 54064 bytes) ]