linux-kernel - Re: [RFC, PATCH 0/5] Going forward with Resource Management


Date:	Fri, 4 Aug 2006 16:46:38 +0530
From:	Srivatsa Vaddagiri <vatsa@...ibm.com>
To:	Andrew Morton <akpm@...l.org>
Cc:	mingo@...e.hu, nickpiggin@...oo.com.au, sam@...ain.net,
	linux-kernel@...r.kernel.org, dev@...nvz.org, efault@....de,
	balbir@...ibm.com, sekharan@...ibm.com, nagar@...son.ibm.com,
	haveblue@...ibm.com, pj@....com
Subject: Re: [RFC, PATCH 0/5] Going forward with Resource Management - A cpu controller

On Fri, Aug 04, 2006 at 12:13:42AM -0700, Andrew Morton wrote:
> There was a lot of discussion last time - Mike, Ingo, others.  It would be
> a useful starting point if we could be refreshed on what the main issues
> were, and whether/how this new patchset addresses them.

The main issues raised against the CPU controller posted last time were
these:

(ref : http://lkml.org/lkml/2006/4/20/404)

a. Interactive tasks not handled
	The patch, which was mainly based on scaling down timeslice of tasks 
	that are above their guarantee, left interactive tasks untouched. This 
	meant that interactive tasks could run uncontrolled and would have 
	affected the guaranteed bandwidth provided for other tasks.

b. Task groups with uncontrolled number of tasks not handled well
	The patch retained the current single per-CPU runqueue. Thus the runqueue
	would contain a mixture of tasks belonging to different groups. Also 
	each task was given a minimum timeslice of 1 tick. This meant that we 
	could not limit the CPU bandwidth of a group that has a large number of 
	tasks to the desired extent.

c. SMP-correctness not implemented
	Guaranteed bandwidth was not enforced across all CPUs taken together.

d. Supported only guaranteed bandwidth and not soft/hard limits.

e. Bursty workloads not handled well
	Scaling down of timeslice, to meet the increased demand of 
	higher-guaranteed task-groups, was not instantaneous. Rather 
	timeslice was scaled down when tasks expired their timeslice
	and were moved to the expired array. This meant that bursty workloads
	would get their due share rather slowly.
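
To put rough numbers on point b, here is a minimal sketch. It is not code
from either patch; it assumes a simplified model in which every runnable
task on the shared runqueue gets at least one tick per pass over the active
array. The names min_group_share_pct() and MIN_TIMESLICE_TICKS are
hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Floor described in point b: every task gets at least 1 tick. */
#define MIN_TIMESLICE_TICKS 1u

/*
 * Lower bound, in percent (rounded down), on the share of one CPU that a
 * group receives per pass over a shared runqueue. Assumed model: each of
 * the group's nr_tasks runnable tasks runs for at least
 * MIN_TIMESLICE_TICKS, and all other tasks together consume other_ticks
 * in the same pass.
 */
static unsigned int min_group_share_pct(unsigned int nr_tasks,
					unsigned int other_ticks)
{
	unsigned int group_ticks = nr_tasks * MIN_TIMESLICE_TICKS;

	/* At least one tick must be consumed for a share to be defined. */
	assert(group_ticks + other_ticks > 0);

	return (100u * group_ticks) / (group_ticks + other_ticks);
}
```

Under this model, a group with 1000 runnable tasks competing against
100 ticks of other work cannot be pushed below roughly 90% of the CPU,
whatever its configured share, because the timeslice cannot be scaled
below one tick.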

Apart from these, the other observation I had was that:

f. Incorrect load accounting?
	Load of a task was accounted only when it expired its timeslice, rather
	than while it was running. This IMO can at times give an inaccurate
	picture of the load a task-group places on a given CPU, and thus affect
	the guaranteed bandwidth for other task-groups.

Could we have overcome all these issues with slight changes to the
design? Hard to say. IMHO we get better control only by segregating tasks
into different runqueues and getting control over which task-group to
schedule next, which is what this new patch attempts to do.

In summary, the patch should address limitations a, b, e and f. I am hoping to
address c using the smpnice approach. Regarding d, this patch provides more
of a soft-limit feature. Some guaranteed usage for task-groups can still
be met, I feel, by limiting the CPU usage of other groups.

To take all this forward, these significant points need to be decided
for a CPU controller:

1. Do we want to split the current 1 runqueue per-CPU into 1 runqueue
   per-task-group per-CPU?

2. How should task-group priority be decided? The decision we take for
   this impacts interactivity of the system. In my patch, I attempt to
   retain good interactivity by letting task-group priority be decided by
   the highest priority task it has.

3. How do we accomplish SMP correctness for task-group bandwidth?
   I believe OpenVZ uses virtual runqueues, which simplifies
   load balancing a bit, though I am not sure if that comes at the expense
   of increased lock contention. IMHO we can try going the smpnice route
   and see how far that can take us.
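
To make points 1 and 2 concrete, here is a minimal sketch. It is not taken
from the patch; it assumes one runqueue per task-group per CPU and the usual
kernel convention that a numerically lower value means a higher priority.
struct group_rq, group_prio() and pick_next_group() are hypothetical names
for this illustration, and bandwidth control is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: priorities lie in [0, MAX_PRIO), lower value = higher priority. */
#define MAX_PRIO 140

/* Hypothetical per-task-group, per-CPU runqueue (point 1). */
struct group_rq {
	const int *task_prio;	/* priorities of the group's runnable tasks */
	size_t nr_running;	/* number of entries in task_prio */
};

/*
 * Point 2: a group's priority is that of its highest-priority runnable
 * task. An empty group reports MAX_PRIO, i.e. it never wins selection.
 */
static int group_prio(const struct group_rq *grq)
{
	int best = MAX_PRIO;
	size_t i;

	assert(grq != NULL);
	for (i = 0; i < grq->nr_running; i++)
		if (grq->task_prio[i] < best)
			best = grq->task_prio[i];
	return best;
}

/*
 * Return the index of the group to schedule next on this CPU, or -1 if
 * no group has a runnable task. Ties go to the lower index.
 */
static int pick_next_group(const struct group_rq *grqs, size_t nr_groups)
{
	int best_prio = MAX_PRIO;
	int best = -1;
	size_t i;

	for (i = 0; i < nr_groups; i++) {
		int prio = group_prio(&grqs[i]);

		if (prio < best_prio) {
			best_prio = prio;
			best = (int)i;
		}
	}
	return best;
}
```

With this rule, a group holding a single high-priority interactive task is
picked ahead of a group full of lower-priority CPU hogs. That keeps
interactivity, but a real controller would still have to check the group's
bandwidth usage before letting it run.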

Ingo/Nick, what are your thoughts here?

-- 
Regards,
vatsa