linux-kernel - Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

Message-ID: <f2b55d220704161610l71c72bc6k364bd9396cd59300@mail.gmail.com>
Date:	Mon, 16 Apr 2007 16:10:59 -0700
From:	"Michael K. Edwards" <medwards.linux@...il.com>
To:	"Peter Williams" <pwil3058@...pond.net.au>
Cc:	"William Lee Irwin III" <wli@...omorphy.com>,
	"Ingo Molnar" <mingo@...e.hu>, "Matt Mackall" <mpm@...enic.com>,
	"Con Kolivas" <kernel@...ivas.org>, linux-kernel@...r.kernel.org,
	"Linus Torvalds" <torvalds@...ux-foundation.org>,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	"Nick Piggin" <npiggin@...e.de>, "Mike Galbraith" <efault@....de>,
	"Arjan van de Ven" <arjan@...radead.org>,
	"Thomas Gleixner" <tglx@...utronix.de>
Subject: Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

On 4/16/07, Peter Williams <pwil3058@...pond.net.au> wrote:
> Note that I talk of run queues
> not CPUs as I think a shift to multiple CPUs per run queue may be a good
> idea.

This observation of Peter's is the best thing to come out of this
whole foofaraw.  Looking at what's happening in CPU-land, I think it's
going to be necessary, within a couple of years, to replace the whole
idea of "CPU scheduling" with "run queue scheduling" across a complex,
possibly dynamic mix of CPU-ish resources.  Ergo, there's not much
point in churning the mainline scheduler through a design that isn't
significantly more flexible than any of those now under discussion.
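
To make "run queue scheduling" concrete, here is a minimal sketch in Python (chosen only for brevity) of one run queue served by several CPU-ish resources. Every name in it (QueuedTask, SharedRunQueue, tick) is hypothetical, and the least-runtime-first ordering is an assumption made for illustration; it is not taken from CFS or from any patch in this thread.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedTask:
    # Hypothetical task record: ordered only by how much time it has consumed.
    runtime_ms: float
    name: str = field(compare=False)


class SharedRunQueue:
    """One queue of runnable tasks served by several CPU-ish resources."""

    def __init__(self, cpus):
        self.cpus = list(cpus)
        self.heap = []  # min-heap keyed on runtime_ms

    def enqueue(self, task):
        heapq.heappush(self.heap, task)

    def tick(self, slice_ms=1.0):
        """Hand each CPU the least-run task for one slice; return the placement."""
        placement = {}
        for cpu in self.cpus:
            if not self.heap:
                break  # more CPUs than runnable tasks: the rest stay idle
            placement[cpu] = heapq.heappop(self.heap)
        for task in placement.values():
            task.runtime_ms += slice_ms  # charge the slice, then requeue
            heapq.heappush(self.heap, task)
        return {cpu: task.name for cpu, task in placement.items()}
```

The point of the shape is that placement is decided per queue, not per CPU, so the set of resources behind a queue can change without touching the task ordering.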

For instance, there are architectures where several "CPUs"
(instruction stream decoders feeding execution pipelines) share parts
of a cache hierarchy ("chip-level multitasking").  On these machines,
you may want to co-schedule a "real" processing task on one pipeline
with a "cache warming" task on the other pipeline -- but only for
tasks whose memory access patterns have been sufficiently analyzed to
write the "cache warming" task code.  Some other tasks may want to
idle the second pipeline so they can use the full cache-to-RAM
bandwidth.  Yet other tasks may be genuinely CPU-intensive (or I/O
bound but so context-heavy that it's not worth yielding the CPU during
quick I/Os), and hence perfectly happy to run concurrently with an
unrelated task on the other pipeline.
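
The three cases above amount to a per-task preference about what the neighbouring pipeline should do. A rough Python sketch of that decision follows; the enum values, the PipelineTask fields, and pick_sibling are all invented for illustration, and a real scheduler would need far more state than this.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SiblingPolicy(Enum):
    WARMER = "run this task's cache-warming helper"
    IDLE = "keep the other pipeline idle"
    ANY = "tolerate any unrelated task"


@dataclass
class PipelineTask:
    name: str
    policy: SiblingPolicy
    warmer: Optional[str] = None  # helper task name, if one has been written


def pick_sibling(primary, runnable):
    """Return the name of the task to run on the second pipeline, or None."""
    if primary.policy is SiblingPolicy.WARMER and primary.warmer:
        return primary.warmer
    if primary.policy is SiblingPolicy.IDLE:
        return None  # leave the full cache-to-RAM bandwidth to the primary
    # Otherwise co-run the first other task that is itself tolerant of company.
    # A WARMER task with no helper written also ends up here (an assumption).
    for task in runnable:
        if task is not primary and task.policy is SiblingPolicy.ANY:
            return task.name
    return None
```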

There are other architectures where several "hardware threads" fight
over parts of a cache hierarchy (sometimes bizarrely described as
"sharing" the cache, kind of the way most two-year-olds "share" toys).
On these machines, one instruction pipeline can't help the other
along cache-wise, but it sure can hurt.  A scheduler designed, tested,
and tuned principally on one of these architectures (hint:
"hyperthreading") will probably leave a lot of performance on the
floor on processors in the former category.

In the not-so-distant future, we're likely to see architectures with
dynamically reconfigurable interconnect between instruction issue
units and execution resources.  (This is already quite feasible on,
say, Virtex4 FX devices with multiple PPC cores, or Altera FPGAs with
as many Nios II cores as fit on the chip.)  Restoring task context may
involve not just MMU swaps and FPU instructions (with state-dependent
hidden costs) but processor reconfiguration.  Achieving "fairness"
according to any standard that a platform integrator cares about (let
alone an end user) will require a fairly detailed model of the hidden
costs associated with different sorts of task switch.
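
As a sketch of what such a model might look like at its very simplest: charge each kind of switch a cost, and size slices so that tasks receive equal useful time rather than equal wall time. The cost table below holds arbitrary placeholder numbers, not measurements from any hardware, and the function names are hypothetical. Whether the incoming task should pay the whole switch cost is itself a policy choice that this sketch simply assumes.

```python
# Placeholder per-switch costs in microseconds. These are made-up values
# for illustration only, NOT measurements of any real platform.
SWITCH_COST_US = {
    "mmu": 5.0,         # address-space swap
    "fpu": 2.0,         # FPU state restore
    "reconfig": 500.0,  # reconfiguring the processor itself
}


def switch_cost(needs):
    """Total hidden cost of switching in a task that needs the given steps."""
    return sum(SWITCH_COST_US[kind] for kind in needs)


def useful_time(slice_us, needs):
    """Time left for real work once the switch cost is paid out of the slice."""
    return max(0.0, slice_us - switch_cost(needs))


def fair_slice(target_useful_us, needs):
    """Slice length that yields the target useful time after the switch."""
    return target_useful_us + switch_cost(needs)
```

With these placeholder values, two tasks given the same 1000 us slice end up with very different amounts of useful time if only one of them needs a reconfiguration, which is the sense in which equal slices are not "fair".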

So if you are interested in schedulers for some reason other than a
paycheck, let the distros worry about 5% improvements on x86[_64].
Get hold of some different "hardware" -- say:
  - a Xilinx ML410 if you've got $3K to blow and want to explore
reconfigurable processors;
  - a SunFire T2000 if you've got $11K and want to mess with a CMT
system that's actually shipping;
  - a QEMU-simulated massively SMP x86 if you're poor but clever
enough to implement funky cross-core cache effects yourself; or
  - a cycle-accurate simulator from Gaisler or Virtio if you want a
real research project.
Then go explore some more interesting regions of parameter space and
see what the demands on mainline Linux will look like in a few years.

Cheers,
- Michael