linux-kernel - Re: fair clock use in CFS

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20070514150800.GB29307@elte.hu>
Date:	Mon, 14 May 2007 17:08:00 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Daniel Hazelton <dhazelton@...er.net>
Cc:	William Lee Irwin III <wli@...omorphy.com>,
	Srivatsa Vaddagiri <vatsa@...ibm.com>, efault@....de,
	tingy@...umass.edu, linux-kernel@...r.kernel.org
Subject: Re: fair clock use in CFS


* Daniel Hazelton <dhazelton@...er.net> wrote:

> [...] In a fair scheduler I'd expect all tasks to get the exact same 
> amount of time on the processor. So if there are 10 tasks running at 
> nice 0 and the current task has run for 20msecs before a new task is 
> swapped onto the CPU, the new task and *all* other tasks waiting to 
> get onto the CPU should get the same 20msecs. [...]

What happens in CFS is that in exchange for this task's 20 msecs the 
other tasks get 2 msecs each. (and not only the one that gets on the CPU 
next) So each task is handled equal. What i described was the first step 
- for each task the same step happens (whenever it gets on the CPU, and 
accounted/weighted for the precise period they spent waiting - so the 
second task would get +4 msecs credited, the third task +6 msecs, etc., 
etc.).

but really - nothing beats first-hand experience: please just boot into 
a CFS kernel and test its precision a bit. You can pick it up from the 
usual place:

   http://people.redhat.com/mingo/cfs-scheduler/

For example start 10 CPU hogs at once from a shell:

   for (( N=0; N < 10; N++ )); do ( while :; do :; done ) & done

[ type 'killall bash' in the same shell to get rid of them. ]

then watch their CPU usage via 'top'. While the system is otherwise idle 
you should get something like this after half a minute of runtime:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2689 mingo     20   0  5968  560  276 R 10.0  0.1   0:03.45 bash
 2692 mingo     20   0  5968  564  280 R 10.0  0.1   0:03.45 bash
 2693 mingo     20   0  5968  564  280 R 10.0  0.1   0:03.45 bash
 2694 mingo     20   0  5968  564  280 R 10.0  0.1   0:03.45 bash
 2695 mingo     20   0  5968  564  280 R 10.0  0.1   0:03.45 bash
 2698 mingo     20   0  5968  564  280 R 10.0  0.1   0:03.45 bash
 2690 mingo     20   0  5968  564  280 R  9.9  0.1   0:03.45 bash
 2691 mingo     20   0  5968  564  280 R  9.9  0.1   0:03.45 bash
 2696 mingo     20   0  5968  564  280 R  9.9  0.1   0:03.45 bash
 2697 mingo     20   0  5968  564  280 R  9.9  0.1   0:03.45 bash

with each task having exactly the same 'TIME+' field in top. (the more 
equal those fields, the more precise/fair the scheduler is. In the above 
output each got its precise share of 3.45 seconds of CPU time.)

then as a next phase of testing please run various things on the system 
(without stopping these loops) and try to get CFS "out of balance" - 
you'll succeed if you manage to get an unequal 'TIME+' field for them. 
Please try _really_ hard to break it. You can run any workload.

Or try massive_intr.c from:

   http://lkml.org/lkml/2007/3/26/319

which uses a much less trivial scheduling pattern to test a scheduler's 
precision of scheduling)

 $ ./massive_intr 9 10
 002765  00000125
 002767  00000125
 002762  00000125
 002769  00000125
 002768  00000126
 002761  00000126
 002763  00000126
 002766  00000126
 002764  00000126

(the second column is runtime - the more equal, the more precise/fair 
the scheduler.)

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/