linux-kernel - Re: [Announce] [patch] Modular Scheduler Core and Completely Fair

Message-Id: <200704190811.17290.philipp.marek@bmlv.gv.at>
Date:	Thu, 19 Apr 2007 08:11:16 +0200
From:	"Ph. Marek" <philipp.marek@...v.gv.at>
To:	linux-kernel@...r.kernel.org
Subject: Re: [Announce] [patch] Modular Scheduler Core and Completely Fair

Pine.LNX.4.64.0704181515290.25880 () alien ! or ! mcafeemobile ! com

Davide Libenzi wrote:
> On Wed, 18 Apr 2007, Ingo Molnar wrote:
> > That's one reason why i dont think it's necessarily a good idea to
> > group-schedule threads, we dont really want to do a per thread group
> > percpu_alloc().
>
> I still do not have clear how much overhead this will bring into the
> table, but I think (like Linus was pointing out) the hierarchy should look
> like:
...
> The "run_queue" concept (and data) that now is bound to a CPU, need to be
> replicated in:
>
> ROOT <- VCPUs add themselves here
>     VCPU <- USERs add themselves here
>         USER <- PROCs add themselves here
>             PROC <- THREADs add themselves here
>                 THREAD (ultimate fine grained scheduling unit)
>
> So ROOT, VCPU, USER and PROC will have their own "run_queue". 
...

I can't comment on the internals of run_queues, their overhead and so on, but 
this discussion leads me to the idea of a dynamic *tree* of scheduler 
queues.

By dynamic I mean that they are configured from user-space - be it with 
something like CLONE_NEW_SCHEDULER_CLASS, or possibly better some other 
interface that allows an *arbitrary* tree that is not coupled to the 
user/process/thread borders. New threads and processes are by default 
created in the parent's queue, just like now.
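
To make the default-inheritance rule concrete, here is a minimal user-space
sketch in Python. It is not kernel code and not a proposed interface: the
names SchedQueue, Task, fork() and the new_queue parameter are all
hypothetical and exist only for illustration.

```python
class SchedQueue:
    """Hypothetical node in a user-configured tree of scheduler queues."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []   # child queues
        self.tasks = []      # tasks attached directly to this queue
        if parent is not None:
            parent.children.append(self)


class Task:
    """Hypothetical schedulable unit (process or thread)."""

    def __init__(self, name, queue):
        self.name = name
        self.queue = queue
        queue.tasks.append(self)

    def fork(self, name, new_queue=None):
        # Default: the child is created in the parent's queue, just like now.
        # Only an explicit request (assumed to come from user-space) moves it.
        target = self.queue if new_queue is None else new_queue
        return Task(name, target)


root = SchedQueue("default")
init = Task("init", root)
sshd = init.fork("sshd")                    # inherits the default queue

user1 = SchedQueue("user1", parent=root)    # e.g. set up at login time
shell = sshd.fork("bash", new_queue=user1)  # explicitly placed in user1
editor = shell.fork("vim")                  # inherits user1 from its parent
```

Nothing in the sketch ties a queue to a user, process or thread boundary;
the tree shape is entirely up to whoever creates the queues.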

So user-space could build a tree like this (e.g. with a PAM module):

 Default queue - init
   +- kernel-thread queue (to avoid having kernel threads being blocked by
   |               user-space)
   +- cron, atd, sshd, .... unless they change their "class"
   +- user1
   |  +- X
   |  +- kde
   |  |  +- konsole
   |  |  +- kmail
   |  |  |  +- mail fetch thread
   |  |  |  +- mail filter thread
   |  |  |  \- GUI thread
   |  |  \- mplayer
   \- user2
      +.....

Whether the queues are handled with some staircase behaviour, or CFS, or just 
get CPU time distributed by nice level, is another question - but they only 
have to be "fair" locally, i.e. among the direct members of one queue.
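
A small worked example of "locally fair", as a Python sketch. It assumes
the simplest possible policy, an equal split among the direct members of
each queue; that policy is only a stand-in for staircase, CFS or nice
levels, and the tree contents and the cpu_shares() helper are hypothetical.

```python
from fractions import Fraction


def cpu_shares(members, share=Fraction(1), prefix=""):
    """Split `share` equally among the direct members of one queue.

    `members` maps a name to either None (a task) or a nested dict
    (a child queue). Returns {path: fraction of total CPU} for tasks.
    """
    result = {}
    local = share / len(members)          # fairness is decided per queue
    for name, children in members.items():
        path = prefix + name
        if children:                      # child queue: recurse with its slice
            result.update(cpu_shares(children, local, path + "/"))
        else:                             # task: it keeps its slice
            result[path] = local
    return result


tree = {
    "kernel": None,
    "sshd": None,
    "user1": {"X": None, "kde": {"konsole": None, "kmail": None}},
    "user2": {"bash": None},
}

shares = cpu_shares(tree)
for path, value in shares.items():
    print(path, value)
```

Under this assumption user2's single task keeps a quarter of the CPU no
matter how many tasks user1 starts, because user1's tasks only compete
with each other inside their own queue.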

Of course, that simply moves the problem into user-space - but I think (and 
have read often enough) that the needs vary so much that a single, hardcoded 
system won't suffice. And we can still try to get the "right" behaviour 
within each queue, just like now.

Walking the tree might make the scheduler not fully O(1) - but by default 
only one queue is defined (or possibly two, one for kernel threads), and 
everything else can be done by user-space.
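
The cost of walking the tree can be sketched as follows, again in Python
and under loud assumptions: each queue does plain round-robin over its
members, no queue is ever empty, and the Queue class and pick_next() are
hypothetical names, not anything from the patch under discussion.

```python
from collections import deque


class Queue:
    """Hypothetical scheduler queue holding tasks (str) or child queues."""

    def __init__(self, name):
        self.name = name
        self.members = deque()

    def add(self, member):
        self.members.append(member)
        return member


def pick_next(root):
    """Descend from the root until a task is found.

    Returns (task, levels_walked). Assumes every queue is non-empty.
    """
    node, levels = root, 0
    while isinstance(node, Queue):
        levels += 1
        member = node.members[0]
        node.members.rotate(-1)   # round-robin within this queue only
        node = member
    return node, levels


# Default setup: a single queue, so every pick costs one level.
flat = Queue("default")
flat.add("init")
flat.add("sshd")

# User-configured setup: the cost grows with the depth user-space chose.
deep = Queue("default")
deep.add("init")
user1 = deep.add(Queue("user1"))
kde = user1.add(Queue("kde"))
kde.add("kmail")
```

With the flat setup pick_next() always walks one level; in the deeper tree
a pick that ends at kmail walks three, so the extra cost is bounded by the
tree depth and only appears once user-space actually builds a tree.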

The mentioned case of a web server with gzip started alongside it would be 
handled by putting each httpd in a queue just below init, and everything else 
in another - or by nicing the webserver, as it's defined as "important".
(I believe that's called "moving policy into userspace" :-)
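
As a rough illustration of the web-server case, the arithmetic below
compares one flat queue with two sibling queues below init. It assumes all
tasks are always runnable and that siblings split CPU time equally; both
function names are made up for this example.

```python
from fractions import Fraction


def httpd_share_flat(n_httpd, n_other):
    """All tasks in one queue: httpd's total share shrinks as others grow."""
    return Fraction(n_httpd, n_httpd + n_other)


def httpd_share_grouped(n_httpd, n_other):
    """Two sibling queues below init: 'httpd' and 'everything else'."""
    if n_httpd == 0:
        return Fraction(0)
    if n_other == 0:
        return Fraction(1)     # nothing else is competing
    return Fraction(1, 2)      # equal split between the two queues


for n_gzip in (1, 4, 12):
    print(n_gzip, httpd_share_flat(4, n_gzip), httpd_share_grouped(4, n_gzip))
```

With four httpd processes, twelve gzip processes cut the web server to a
quarter of the CPU in the flat case, while the grouped case stays at one
half regardless of how many gzips are started.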

Regards,

Phil