[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <BC02C49EEB98354DBA7F5DD76F2A9E80030A7DD60C@azsmsx501.amr.corp.intel.com>
Date: Mon, 15 Dec 2008 13:49:59 -0700
From: "Ma, Chinang" <chinang.ma@...el.com>
To: Henrik Austad <henrik@...tad.us>
CC: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...e.hu>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Wilcox, Matthew R" <matthew.r.wilcox@...el.com>,
"Van De Ven, Arjan" <arjan.van.de.ven@...el.com>,
"Styner, Douglas W" <douglas.w.styner@...el.com>,
"Chilukuri, Harita" <harita.chilukuri@...el.com>,
"Wang, Peter Xihong" <peter.xihong.wang@...el.com>,
"Nueckel, Hubert" <hubert.nueckel@...el.com>,
Chris Mason <chris.mason@...cle.com>,
"Tripathi, Sharad C" <sharad.c.tripathi@...el.com>
Subject: RE: CFS scheduler OLTP perforamnce
>-----Original Message-----
>From: Henrik Austad [mailto:henrik@...tad.us]
>Sent: Monday, December 15, 2008 8:57 AM
>To: Ma, Chinang
>Cc: Peter Zijlstra; Ingo Molnar; linux-kernel@...r.kernel.org; Wilcox,
>Matthew R; Van De Ven, Arjan; Styner, Douglas W; Chilukuri, Harita; Wang,
>Peter Xihong; Nueckel, Hubert; Chris Mason; Tripathi, Sharad C
>Subject: Re: CFS scheduler OLTP perforamnce
>
*snip*
>On Mon, Dec 15, 2008 at 08:32:41AM -0700, Ma, Chinang wrote:
>>
>> >Is this really a kernel-scheduler problem? Or is it an error in the way
>the
>> >timestamps are recoreded?
>> >
>> >Does not the time recorded then depend upon how the foreground process
>is
>> >scheduled, and not the log-writer? What happens if you log the time at
>the
>> >start and end of the log-writer function? Then you would get the time-
>delta
>> >between signaling and rr-wakeup, as well as time spent writing buffer to
>> >disk. By using the final timestamp in the foreground process, you'd get
>the
>> >latency for the last foreground-process wakeup as well.
>> >
>> >Or am I completely missing the point here?
>>
>> Since I don't have access to the source code, I cannot make the about
>change to the foreground or log writer.
>
>Aha, then it makes sense :-)
>
>What about, as a test, set the foreground process as rt-prio (not a good
>thing
>for a production environment, but as a test, it should give you an
>indication
>to how long time the wakeup-time is.
>
When setting foreground and log writer to rt-prio, the log latency reduced to 4.8ms. Performance is about 1.5% higher than the CFS result.
On a side note, we had been using rt-prio on all DBMS processes and log writer ( in higher priority) for the best OLTP performance. That has worked pretty well until 2.6.25 when the new rt scheduler introduced the pull/push task for lower scheduling latency for rt-task. That has negative impact on this workload, probably due to the more elaborated load calculation/balancing for hundred of foreground rt-prio processes. Also, there is that question of no production environment would run DBMS with rt-prio. That is why I am going back to explore CFS and see whether I can drop rt-prio for good.
>>
>> >
>> >> >How would you characterize the log tasks behaviour?
>> >> >
>> >> > - does it run long/short (any quantization)
>> >>
>> >> There were 371 log writes per second so ~2.7ms per log writer
>execution.
>> >> Out of this we know ~2.13 ms was spent waiting for log file i/o. Log
>> >writer
>> >> was running for (2.7ms - 2.13ms) = 0.57ms
>> >>
>> >> > - does it sleep long/short - how does it compare to its runtime?
>> >>
>> >> With the current throughput, log writer should be constantly writing
>log
>> >> and rarely sleep.
>> >>
>> >> > - does it wake others?
>> >> > - if so, always the one who woke it, or multiple others?
>> >>
>> >> Log writer wake multiple foreground processes using vector post.
>> >
>> >So, you don't really know if the initial process that recorded the
>> >timestamp
>> >is the one who is awoken - so the time taken could be *very* long?
>> >
>> Yes. If we only look at the timestamp in just one foreground process. The
>reported log latency is an average of the sum of all foreground stats.
>Since this average went down with the log writer set to SCHED_RR, I took
>that as log writer get to do its job sooner.
>
>Ah, I guess the time went down because the writer was scheduler much
>earlier,
>and this lead to decreased time, but as the foreground process needs to be
>waken up and scheduled *after* the writer has finished, you will still see
>some latencies here.
>
>I'm suspecting you record a lot of time waiting for the foreground process
>to be scheduled again.
>
>
>henrik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists