linux-kernel - Re: SCHED

Message-ID: <20151211222731.63d22a86@luca-1225C>
Date:	Fri, 11 Dec 2015 22:27:31 +0100
From:	Luca Abeni <luca.abeni@...tn.it>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Juri Lelli <juri.lelli@...il.com>,
	Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: SCHED_RR vs push-pull

Hi Steven,

On Fri, 11 Dec 2015 14:53:59 -0500
Steven Rostedt <rostedt@...dmis.org> wrote:
[...]
> > > The push-pull thing only acts when there's idle or new tasks, and
> > > in the above scenario, the CPU with only the single RR task will
> > > happily continue running that task, while the other CPU will have
> > > to RR between the remaining 3.  
> > I might be wrong, but I think this is due to the
> > 	if (lowest_rq->rt.highest_prio.curr <= task->prio) {
> > in rt.c::find_lock_lowest_rq().
> > I suspect that changing "<=" in "<" might fix the issue, at the
> > cost of generating a lot of useless tasks migrations.
> 
> I'm against this.
I agree; this was just a quick hack to check if my theory is correct
(and to work around the issue for someone needing an immediate solution
to this problem).


> Unless we only do this when current and the task we
> want to move are both RR. It might help.
Yes, in a proper solution a check for RR is needed. But I think it is
also important to compare the number of tasks at the highest priority
on the two runqueues (otherwise, we risk continuously bouncing tasks
between the two CPUs).


> > 
> > > Now my initial thoughts were to define a global RR order using a
> > > virtual timeline and you'll get something like EEVDF on a per RR
> > > prio level with push-pull state between that.
> > > 
> > > Which might be a tad over engineered.  
> > I suspect this issue can be fixed in a simpler way, by changing the
> > check I mentioned above.
> 
> What happens when current is FIFO, and we just moved an RR task over
> that will now never run?
Uh... I did not think about it. Having SCHED_FIFO and SCHED_RR tasks at
the same priority level is... dangerous. Probably, when a SCHED_FIFO
task is executing on a CPU, all the SCHED_RR tasks with the same
priority must be pushed to other CPUs... And tasks should never be
pushed to a CPU where there is a SCHED_FIFO task with the same priority.
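
The second half of that rule could be written as a predicate along these
lines. This is only a userspace sketch, not kernel code: struct rq_view
and may_push_rr_task() are hypothetical names, and it assumes the kernel
convention that a lower numeric prio means a higher priority.

```c
#include <assert.h>
#include <sched.h>     /* SCHED_FIFO, SCHED_RR */
#include <stdbool.h>

/* Hypothetical snapshot of the task currently running on a target CPU. */
struct rq_view {
    int curr_policy;   /* SCHED_FIFO, SCHED_RR, ... */
    int curr_prio;     /* assumed convention: lower value = higher priority */
};

/*
 * Return true if an RR task with priority task_prio may be pushed to dst.
 * An equal-priority RR destination is only "not forbidden" here; a real
 * check would also have to compare how many RR tasks each CPU holds.
 */
static bool may_push_rr_task(int task_prio, const struct rq_view *dst)
{
    assert(dst);

    if (dst->curr_prio < task_prio)
        return false;  /* dst is running something with higher priority */
    if (dst->curr_prio == task_prio && dst->curr_policy == SCHED_FIFO)
        return false;  /* the FIFO task would never yield to the RR task */
    return true;
}
```

With this sketch, an RR task at prio 10 is refused by a CPU running a
FIFO task at prio 10, but not by one running an RR task at prio 10.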

> > If you want to balance SCHED_RR tasks with the same priority, I
> > think the "lowest_rq->rt.highest_prio.curr <= task->prio" should be
> > extended to do the migration if:
> > - the local task has a higher priority than the highest priority
> > task on lowest_rq (this is what's currently done)
> > - the local task has the same priority of the highest priority task
> > on lowest_rq and they are SCHED_RR and the number of tasks with
> >   task->prio on the local RQ is larger than the number of tasks with
> >   lowest_rq->rt.highest_prio.curr on lowest_rq + 1.
> 
> Well, the number of tasks may not be good enough. We need to only look
> at RR tasks.
I agree... The proper check is more complex than what I wrote.

> Perhaps if current is RR and the waiter on the other CPU
> is RR, we can do a scan to see if a balance should be done.
Yes; but I suspect we should check how many RR tasks with this priority
are on this CPU, and how many RR tasks with this priority are on the
other CPU... When the difference is larger than 1, the task can be
pushed (otherwise, we again risk bouncing tasks between the two CPUs.
Think about 5 RR tasks, all with the same priority, and 2 CPUs: I guess
the best thing to do is to put 3 tasks on one CPU and 2 on the other, and
not try to balance any further. Otherwise, we end up with a migration at
every timeslice).
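
To make the "difference larger than 1" rule concrete, here is a small
userspace sketch. It is untested against the real scheduler, and
should_push_rr() and balance_rr() are hypothetical helpers; the counts
are assumed to cover only runnable SCHED_RR tasks at one given priority.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether one RR task should be pushed from the local CPU to a
 * remote CPU, given the number of same-priority RR tasks on each.
 */
static bool should_push_rr(int local_rr, int remote_rr)
{
    assert(local_rr >= 0 && remote_rr >= 0);
    return local_rr - remote_rr > 1;
}

/*
 * Apply the rule until it no longer fires.
 * Returns the number of migrations performed.
 */
static int balance_rr(int *local_rr, int *remote_rr)
{
    int migrations = 0;

    while (should_push_rr(*local_rr, *remote_rr)) {
        (*local_rr)--;
        (*remote_rr)++;
        migrations++;
    }
    return migrations;
}
```

Starting from 5 tasks on one CPU and 0 on the other, this settles at 3
and 2 after two migrations and then stops, since neither CPU sees a
difference larger than 1. With a threshold of "larger than 0" instead,
the 3/2 split would flip to 2/3 and back indefinitely.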


> > I think this could work, but I just looked at the code, without any
> > real test. If you provide a simple program implementing a testcase,
> > I can do some experiments in next week.
> > 
> > The alternative (of course I have to mention it :) would be to use
> > SCHED_DEADLINE instead of SCHED_RR.
> 
> Hmm, I wonder if we could have a wrapper around SCHED_DEADLINE to
> implement SCHED_RR. Probably not, because SCHED_RR has hard coded
> priorities and SCHED_DEADLINE is more dynamic (and still higher than
> SCHED_FIFO).
I think we can do something similar to SCHED_RR (but we would need to
implement a "soft" runtime enforcement), but of course SCHED_DEADLINE
cannot provide the fixed-priority behaviour.



				Luca

> 
> > > 
> > > Happy thinking ;-)  
> 
> Heh, I originally thought Peter said "Happy Thanksgiving".
> 
> -- Steve
