Message-ID: <b6a2d2e20902100646q6c7073cse86b8f5790a120ac@mail.gmail.com>
Date: Tue, 10 Feb 2009 14:46:40 +0000
From: Rolando Martins <rolando.martins@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org
Subject: Re: cgroup, RT reservation per core(s)?
On 2/10/09, Peter Zijlstra <peterz@...radead.org> wrote:
> On Mon, 2009-02-09 at 20:04 +0000, Rolando Martins wrote:
>
> > I should have elaborated this more:
> >
> >                root
> >            ----|----
> >            |       |
> >  (0.5 mem) 0       1 (100% rt, 0.5 mem)
> >                ---------
> >                |   |   |
> >                2   3   4 (33% rt for each group, 33% mem
> >                           per group (0.165))
> > Rol
>
>
>
> Right, I think this can be done.
>
> You would indeed need cpusets and sched-cgroups.
>
> Split the machine in 2 using cpusets.
>
>   ___R___
>  /       \
> A         B
>
> Where R is the root cpuset, and A and B are the siblings.
> Assign A one half the cpus, and B the other half.
> Disable load-balancing on R.
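>
> An untested sketch of that split (assuming a 4-cpu box; /dev/cgroup is
> just an illustrative mount point):
>
>   mount -t cgroup -o cpuset cpuset /dev/cgroup
>   mkdir /dev/cgroup/A /dev/cgroup/B
>   echo 0-1 > /dev/cgroup/A/cpuset.cpus    # half the cpus to A
>   echo 0   > /dev/cgroup/A/cpuset.mems
>   echo 2-3 > /dev/cgroup/B/cpuset.cpus    # the other half to B
>   echo 0   > /dev/cgroup/B/cpuset.mems
>   echo 0   > /dev/cgroup/cpuset.sched_load_balance  # stop balancing in R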
>
> Then using sched cgroups create the hierarchy
>
>   ____1____
>  /    |    \
> 2     3     4
>
> Where 1 can be the root group if you like.
>
> Assign 1 a utilization limit of 100%, and 2,3 and 4 a utilization limit
> of 33% each.
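>
> Untested sketch of the limits (cpu controller on a second hierarchy,
> group 1 being its root; 1s period, 33% ~= 330ms; /dev/cgroup-cpu is an
> illustrative mount point):
>
>   mount -t cgroup -o cpu cpu /dev/cgroup-cpu
>   # group 1 == the root group; its 100% comes from the global knobs
>   echo 1000000 > /proc/sys/kernel/sched_rt_period_us
>   echo 1000000 > /proc/sys/kernel/sched_rt_runtime_us
>   mkdir /dev/cgroup-cpu/2 /dev/cgroup-cpu/3 /dev/cgroup-cpu/4
>   for g in 2 3 4; do
>     echo 1000000 > /dev/cgroup-cpu/$g/cpu.rt_period_us
>     echo  330000 > /dev/cgroup-cpu/$g/cpu.rt_runtime_us  # ~33%
>   done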
>
> Then place the tasks that get 100% cputime on your 2 cpus in cpuset A
> and sched group 1.
>
> Place your other tasks in B,{2-4} respectively.
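>
> Placement would then look something like (pids purely illustrative):
>
>   echo $PID_OF_100PCT_TASK > /dev/cgroup/A/tasks      # cpuset A
>   echo $PID_OF_100PCT_TASK > /dev/cgroup-cpu/tasks    # sched group 1
>   echo $PID_OF_33PCT_TASK  > /dev/cgroup/B/tasks      # cpuset B
>   echo $PID_OF_33PCT_TASK  > /dev/cgroup-cpu/2/tasks  # sched group 2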
>
> The reason this works is that bandwidth distribution is sched-domain
> wide, and by disabling load-balancing on R, you split the scheduling
> domain.
>
> I've never actually tried anything like this, let me know if it
> works ;-)
>
Thanks Peter, it works!
I am thinking about different strategies to use in my RT middleware
project, and I think there is a limitation.
If I wanted to have some RT bandwidth in the B cpuset, I couldn't, because I
assigned A.cpu.rt_runtime_us = root.cpu.rt_runtime_us (and then subdivided
the A cpuset with 2, 3, 4, each one getting A.cpu.rt_runtime_us/3).
This happens because there is a single global /proc/sys/kernel/sched_rt_runtime_us and
/proc/sys/kernel/sched_rt_period_us.
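For reference, the global knobs and their usual defaults (95% of a 1s
period), which apply across all cpus at once:

cat /proc/sys/kernel/sched_rt_runtime_us   # 950000 by default
cat /proc/sys/kernel/sched_rt_period_us    # 1000000 by default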
What do you think about adding a separate tuple (runtime,period) for
each core/cpu?
In this case:
/proc/sys/kernel/sched_rt_runtime_us_0
/proc/sys/kernel/sched_rt_period_us_0
...
/proc/sys/kernel/sched_rt_runtime_us_n (one pair per cpu)
/proc/sys/kernel/sched_rt_period_us_n
Given this, we could do the following:
mkdir /dev/cgroup/A
echo 0-1 > /dev/cgroup/A/cpuset.cpus
echo 0 > /dev/cgroup/A/cpuset.mems
echo 1000000 > /dev/cgroup/A/cpu.rt_period_us
echo 1000000 > /dev/cgroup/A/cpu.rt_runtime_us
This would only succeed if we could allocate
(cpu.rt_runtime_us, cpu.rt_period_us) on both CPU 0 and CPU 1;
otherwise it would fail.
mkdir /dev/cgroup/B
echo 2-3 > /dev/cgroup/B/cpuset.cpus
echo 0 > /dev/cgroup/B/cpuset.mems
echo 1000000 > /dev/cgroup/B/cpu.rt_period_us
echo 800000 > /dev/cgroup/B/cpu.rt_runtime_us
The same here: it would fail if we couldn't allocate 0.8 on both CPU 2 and CPU 3.
Does this make sense? ;)
Rol