linux-kernel - Re: Regression in latest sched-git

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20080213163444.GA19570@linux.vnet.ibm.com>
Date:	Wed, 13 Feb 2008 22:04:44 +0530
From:	Dhaval Giani <dhaval@...ux.vnet.ibm.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	vatsa@...ux.vnet.ibm.com, Ingo Molnar <mingo@...e.hu>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: Regression in latest sched-git

On Wed, Feb 13, 2008 at 01:51:18PM +0100, Peter Zijlstra wrote:
> 
> On Wed, 2008-02-13 at 08:30 +0530, Srivatsa Vaddagiri wrote:
> > On Tue, Feb 12, 2008 at 08:40:08PM +0100, Peter Zijlstra wrote:
> > > Yes, latency isolation is the one thing I had to sacrifice in order to
> > > get the normal latencies under control.
> > 
> > Hi Peter,
> > 	I don't have easy solution in mind either to meet both fairness
> > and latency goals in a acceptable way.
> 
> Ah, do be careful with 'fairness' here. The single RQ is fair wrt cpu
> time, just not quite as 'fair' wrt to latency.
> 
> > But I am puzzled at the max latency numbers you have provided below:
> > 
> > > The problem with the old code is that under light load: a kernel make
> > > -j2 as root, under an otherwise idle X session, generates latencies up
> > > to 120ms on my UP laptop. (uid grouping; two active users: peter, root).
> > 
> > If it was just two active users, then max latency should be:
> > 
> > 	latency to schedule user entity (~10ms?) +
> > 	latency to schedule task within that user 
> > 
> > 20-30 ms seems more reaonable max latency to expect in this scenario.
> > 120ms seems abnormal, unless the user had large number of tasks.
> > 
> > On the same lines, I cant understand how we can be seeing 700ms latency
> > (below) unless we had: large number of active groups/users and large number of 
> > tasks within each group/user.
> 
> All I can say it that its trivial to reproduce these horrid latencies.
> 

Hi Peter,

I've been trying to reproduce the latencies, and the worst I have
managed only 80ms. At an average I am getting around 60 ms. This is with
a make -j4 as root, and dhaval running other programs. (with maxcpus=1).

> As for Ingo's setup, the worst that he does is run distcc with (32?)
> instances on that machine - and I assume he has that user niced waay
> down.
> 
> > > Others have reported latencies up to 300ms, and Ingo found a 700ms
> > > latency on his machine.
> > > 
> > > The source for this problem is I think the vruntime driven wakeup
> > > preemption (but I'm not quite sure). The other things that rely on
> > > global vruntime are sleeper fairness and yield. Now while I can't
> > > possibly care less about yield, the loss of sleeper fairness is somewhat
> > > sad (NB. turning it off with the old group scheduling does improve life
> > > somewhat).
> > > 
> > > So my first attempt at getting a global vruntime was flattening the
> > > whole RQ structure, you can see that patch in sched.git (I really ought
> > > to have posted that, will do so tomorrow).
> > 
> > We will do some exhaustive testing with this approach. My main concern
> > with this is that it may compromise the level of isolation between two
> > groups (imagine one group does a fork-bomb and how it would affect
> > fairness for other groups).
> 
> Again, be careful with the fairness issue. CPU time should still be
> fair, but yes, other groups might experience some latencies.
> 

I know I am missing something, but aren't we trying to reduce latencies
here?

-- 
regards,
Dhaval
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/