lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b647ffbd0806161059q7048ea40id0f4235ab19d6ca0@mail.gmail.com>
Date:	Mon, 16 Jun 2008 19:59:38 +0200
From:	"Dmitry Adamushko" <dmitry.adamushko@...il.com>
To:	"Gregory Haskins" <ghaskins@...ell.com>
Cc:	"Peter Zijlstra" <a.p.zijlstra@...llo.nl>,
	"Ingo Molnar" <mingo@...e.hu>,
	"Steven Rostedt" <rostedt@...dmis.org>,
	"Thomas Gleixner" <tglx@...utronix.de>,
	linux-kernel@...r.kernel.org
Subject: Re: [sched-devel, patch-rfc] rework of "prioritize non-migratable tasks over migratable ones"

Hi Gregory,

> [ ... ]
>>> Doesn't this violate POSIX ?
>>>
>>
>> If so, then the idea of "prioritize non-migratable tasks over
>> migratable ones" violates it, not just an artefact of this particular
>> implementation.
>
> This isn't entirely accurate though.  Your proposal does alter the behavior from the original patch.  In fact, I put in the xqueue/squeue concept specifically to *avoid* the condition that you are creating.  I think this was the point Peter was making.
>
> However, to your point: I can agree that either implementation creates the possibility for creating strange scheduling artifacts.  And I think the point you are making is that once we have the artifact, whats another one? ;)

My guess was that Peter points out at the possibility of task_2
getting ahead of task_1, despite task_1's longer presence on the
run-queue and both tasks having equal priority.

IOW., it's not anymore a round-robin distribution for equal-prio
tasks (if that's how POSIX requires it... yeah, I'm speculating here
:-) but "migratable" vs. "non-migratable" in addition.

With your implementation, a "non-migratable" task gets ahead of
"migratable" (and possibly, some "migratable" that are on the queue
but for some reason have not been pulled/pushed yet).

With mine, one "non-migratable" can preempt another "non-migratable".

Partly, it could be fixed by : add a "non-migratable" task to the head
of the queue _only_ if the "current" task (or better -- if a
head-of-the-queue task) is "migratable".

One way or another, we have different aritifacts (and mine have likely
more) but conceptually, both "violates" POSIX if a strict round-robin
scheduling is required.

By "strict round-robin scheduling" I mean something as follows:

a woken-up task_N can get a CPU _only_ if none of other currently
pending tasks in the system (that's on all cpus and, sure, _RR and
_FIFO and of equal prio) can run on this CPU.

e.g. the following behavior of the load-balancer would be wrong (migth
have been possible with the 'old' scheduler/ load-balancer):

- task_1 is placed on CPU_N upon a wake-up with wake_idle();

- however, there were task_2 (currently running) and task_3 (pending) on CPU_M;

- task_3 was able to run on CPU_N _but_ has not been considered by the
load-balancer yet (let's say CPU_M became free just a moment ago, not
enough for the load-balancer to notice it).

i.e. task_1 gets ahead of task_3, although both have equal prios and
both are able to run on this cpu.

yeah, enough speculations :-)

>
>> Anyway, the original (and my rework) algorithm is somewhat
>> sub-optimal. It considers only "p->rt.nr_cpus_allowed == 1", while the
>> ideal algorithm would behave as follows:
>>
>> 1) we get a newly woken-up task on CPU_i;
>>
>> 2) this task have a number (!= 1) of allowed CPUs (which is a sub-set
>> of all) _but_ all of them are busy at the moment serving high prio
>> tasks;
>
> I don't understand the significance of the stipulation that its a "sub-set of all".  Can you elaborate?  Or did you mean to say "it *may* be a sub-set of all"?

it should have been "rt_se->nr_cpus_allowed not _necessarily_ equal to
1", meaning rt_se->nr_cpus_allowed is a sub-set of ALL_CPUS, not
strictly a single cpu.

We consider a task to be "bound" if

"rt_se->nr_cpus_allowed == 1"

i.e. a task needs to be bound to a _single_ cpu.

what I mean is e.g.

say, there are 4 cpus and cpu_0 and cpu_1 are busy serving task_0 and task_1;

there are also task_3 and task_4 which have equal prios _but_ lower
(worse) than those of task_0 and task_1.

task_3 can run on _any_ cpu but task_4 can run only on 2 of them --
cpu_1 and cpu_3.

task_3 is running on cpu_3

task_4 is woken up... with your/mine implementation it won't get cpu_3
as it's _not_ considered "bound" (rt_se->nr_cpus_allowed == 2)

while in fact, task_4 still might have been preempted and moved onto cpu_4.

that's why I called the current (mine is similar)
definition/implementation of "bound" (or "non-migratable" in my terms
above) sub-optimal.

hope it's clear now (and none of the important details are escaping me) :-)


>> All in all, I think that the initial implementation is sub-optimal
>
> Ouch :)

ok, mine is no way better :-)

repeating myself, they address just a specific case
(rt_se->nr_cpus_allowed == 1) of the broader "problem" which I
describe above.


>
> Regards,
> -Greg
>

-- 
Best regards,
Dmitry Adamushko
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ