Message-ID: <20101119031105.GC13830@dastard>
Date: Fri, 19 Nov 2010 14:11:05 +1100
From: Dave Chinner <david@...morbit.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Wu Fengguang <fengguang.wu@...el.com>, Jan Kara <jack@...e.cz>,
Christoph Hellwig <hch@....de>, Theodore Ts'o <tytso@....edu>,
Chris Mason <chris.mason@...cle.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Mel Gorman <mel@....ul.ie>, Rik van Riel <riel@...hat.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
linux-mm <linux-mm@...ck.org>, linux-fsdevel@...r.kernel.org,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 00/13] IO-less dirty throttling v2
On Wed, Nov 17, 2010 at 11:33:50PM -0800, Andrew Morton wrote:
> On Thu, 18 Nov 2010 18:27:06 +1100 Dave Chinner <david@...morbit.com> wrote:
>
> > > > Indeed, nobody has
> > > > realised (until now) just how inefficient it really is because of
> > > > the fact that the overhead is mostly hidden in user process system
> > > > time.
> > >
> > > "hidden"? You do "time dd" and look at the output!
> > >
> > > _now_ it's hidden. You do "time dd" and whee, no system time!
> >
> > What I meant is that the cost of foreground writeback was hidden in
> > the process system time. Now we have separated the two of them, we
> > can see exactly how much it was costing us because it is no longer
> > hidden inside the process system time.
>
> About a billion years ago I wrote the "cyclesoak" thingy which measures
> CPU utilisation the other way around: run a lowest-priority process on
> each CPU in the background, while running your workload, then find out
> how much CPU time cyclesoak *didn't* consume. That way you account for
> everything: user time, system time, kernel threads, interrupts,
> softirqs, etc. It turned out to be pretty accurate, despite the
> then-absence of SCHED_IDLE.
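(For anyone who hasn't seen it, the idea is roughly the sketch below - not
the actual cyclesoak source, just an illustration of the technique: pin one
lowest-priority spinner to each CPU, count how much work it gets done, and
compare that against a calibration run on an idle system. nice 19 stands in
for the then-absent SCHED_IDLE, and all the names are made up.)

	/* cyclesoak-style sketch: one low-priority spinner per CPU.
	 * Whatever fraction of the idle-system iteration count the
	 * spinners fail to reach under load is the CPU consumed by
	 * everything else - user, system, kthreads, irqs, softirqs.
	 */
	#define _GNU_SOURCE
	#include <sched.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <time.h>
	#include <sys/resource.h>
	#include <sys/wait.h>

	static unsigned long long spin_for(int seconds)
	{
		unsigned long long count = 0;
		time_t end = time(NULL) + seconds;

		while (time(NULL) < end)
			count++;	/* soak cycles nobody else wants */
		return count;
	}

	static void soak_cpu(int cpu, int seconds)
	{
		cpu_set_t mask;

		CPU_ZERO(&mask);
		CPU_SET(cpu, &mask);
		sched_setaffinity(0, sizeof(mask), &mask); /* one soaker per CPU */
		setpriority(PRIO_PROCESS, 0, 19);	   /* lowest priority */

		printf("cpu %d: %llu iterations\n", cpu, spin_for(seconds));
		exit(0);
	}

	int main(int argc, char **argv)
	{
		int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
		int seconds = argc > 1 ? atoi(argv[1]) : 10;
		int i;

		for (i = 0; i < ncpus; i++)
			if (fork() == 0)
				soak_cpu(i, seconds);
		while (wait(NULL) > 0)
			;
		/* Compare the per-CPU counts against an idle-system run
		 * to work out how much CPU the workload actually used. */
		return 0;
	}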
Yeah, I just use PCP to tell me what the CPU usage is in a nice
graph. The link below is an image of the "overview" monitoring tab I
have - total CPU, IOPS, bandwidth, XFS directory ops and context
switches. Here's what the behaviour of an increasing number of dd's
with this series looks like:
http://userweb.kernel.org/~dgc/io-less-throttle-dd.png
Left to right, that's 1 dd, 2, 4, 8, 16 and 32 dd's, then a gap, then
the 8-way fs_mark workload running. These are all taken at a 5s
sample period.
FWIW, on the 32 thread dd (the rightmost of the set of pillars),
you can see the sudden increase in system CPU usage in the last few
samples (which corresponds to the first few dd's completing and
exiting) that I mentioned previously.
Basically, I'm always looking at the total CPU usage of a workload,
memory usage of caches, etc, in this manner. Sure, I use stuff like
time to get numbers to drop out of test scripts, but most of my
behavioural analysis is done through observing differences between
two charts and then looking deeper to work out what changed...
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com