Message-ID: <20090509103646.GA16138@elte.hu>
Date: Sat, 9 May 2009 12:36:46 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Wu Fengguang <fengguang.wu@...el.com>
Cc: Frédéric Weisbecker <fweisbec@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Li Zefan <lizf@...fujitsu.com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Andi Kleen <andi@...stfloor.org>,
Matt Mackall <mpm@...enic.com>,
Alexey Dobriyan <adobriyan@...il.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [patch] tracing/mm: add page frame snapshot trace
* Wu Fengguang <fengguang.wu@...el.com> wrote:
> > Preliminary timings on an older, 1GB RAM 2 GHz Athlon64 box show
> > that it's plenty fast:
> >
> > # time echo -1 > /debug/tracing/objects/mm/pages/trigger
> >
> > real 0m0.127s
> > user 0m0.000s
> > sys 0m0.126s
> >
> > # time cat /debug/tracing/per_cpu/*/trace_pipe_raw > /tmp/page-trace.bin
> >
> > real 0m0.065s
> > user 0m0.001s
> > sys 0m0.064s
> >
> > # ls -l /tmp/1
> > -rw-r--r-- 1 root root 13774848 2009-05-09 11:46 /tmp/page-dump.bin
> >
> > 127 milliseconds to collect, 65 milliseconds to dump. (And that's not
> > using splice() to dump the trace data.)
>
> That's pretty fast and on par with kpageflags!
It's already faster than kpageflags here: on a 32 GB box I just
tried, the sum of timings (dumping + reading of 4 million page
frame records, into/from a sufficiently large trace buffer) is 2.8
seconds.
Current upstream kpageflags takes 3.3 seconds:
phoenix:/home/mingo> time cat /proc/kpageflags > /tmp/1
real 0m3.338s
user 0m0.004s
sys 0m0.608s
(although it varies a bit, sometimes down to 3.0 seconds,
sometimes more)
That makes the ftrace-based dump about 10% faster. Note that output
performance could be improved further by using splice().
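As a rough check of the "about 10%" figure, the relative difference can be worked out from the timings quoted above. This is only a sketch: the function name is made up for illustration, and the 3.0 s value is simply the low end of the run-to-run variation mentioned, not a separate measurement.

```python
def relative_speedup(baseline_s, candidate_s):
    """Fraction of the baseline wall time saved by the candidate."""
    return (baseline_s - candidate_s) / baseline_s


# Timings taken from the message; nothing here is newly measured.
ftrace_s = 2.8                   # trigger + read of the page trace
kpageflags_runs = [3.338, 3.0]   # the timed run and the reported low end

for base in kpageflags_runs:
    saved = relative_speedup(base, ftrace_s)
    print(f"kpageflags {base:.3f}s vs ftrace {ftrace_s:.1f}s: {saved:.1%} faster")
```

Depending on which kpageflags run is taken as the baseline, the saving comes out at roughly 7% to 16%, which brackets the quoted figure.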
Also, it's apples to oranges, in an unfavorable-to-ftrace way: the
pages object collection outputs all of these fields:
field:unsigned short common_type; offset:0; size:2;
field:unsigned char common_flags; offset:2; size:1;
field:unsigned char common_preempt_count; offset:3; size:1;
field:int common_pid; offset:4; size:4;
field:int common_tgid; offset:8; size:4;
field:unsigned long pfn; offset:16; size:8;
field:unsigned long flags; offset:24; size:8;
field:unsigned long index; offset:32; size:8;
field:unsigned int count; offset:40; size:4;
field:unsigned int mapcount; offset:44; size:4;
Plus it generates and outputs a timestamp as well, while kpageflags
outputs just the page flags (and kpagecount only the page counts).
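To make the size difference concrete, the field list above can be turned into a fixed-size record layout. The following is a minimal sketch, assuming a little-endian x86-64 box; `PAGE_EVENT`, `PageEvent` and `decode_page_event` are hypothetical names. It decodes a single event payload only and does not parse the ring-buffer page headers or per-event headers (which carry the timestamps) found in real trace_pipe_raw output.

```python
import struct
from collections import namedtuple

# Layout copied from the field list above; little-endian x86-64 is assumed.
# "4x" covers the padding between common_tgid (ends at byte 12) and pfn
# (offset 16).
PAGE_EVENT = struct.Struct("<HBBii4xQQQII")

PageEvent = namedtuple(
    "PageEvent",
    "common_type common_flags common_preempt_count common_pid common_tgid "
    "pfn flags index count mapcount",
)


def decode_page_event(payload):
    """Decode one event payload into named fields (hypothetical helper)."""
    return PageEvent._make(PAGE_EVENT.unpack_from(payload))


# Round-trip a synthetic record to check that the offsets line up.
sample = PAGE_EVENT.pack(42, 0, 0, 1234, 1234, 0x1000, 0x40000000, 7, 2, 1)
event = decode_page_event(sample)
print(PAGE_EVENT.size, hex(event.pfn), event.count, event.mapcount)
```

The payload is 48 bytes per page frame, against the 8 bytes per page that /proc/kpageflags emits, so the trace moves about six times as much data even before timestamps and buffer headers are counted.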
Spreading the dumping+output out to the 16 CPUs of this box would
shorten the run time at least 10-fold, to about 0.3-0.5 seconds
IMHO. (but that has to be tried and measured first)
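The estimate can be sanity-checked with a simple Amdahl's-law model. This is an assumption made here for illustration, not something that was measured or proposed in the thread: the function names are hypothetical, and the model ignores memory-bandwidth and locking effects that a real run would show.

```python
def amdahl_time(total_s, serial_fraction, ncpus):
    """Run time when only the non-serial part is spread over ncpus."""
    return total_s * (serial_fraction + (1.0 - serial_fraction) / ncpus)


def implied_serial_fraction(total_s, target_s, ncpus):
    """Serial fraction that would yield target_s on ncpus (model inverse)."""
    ratio = target_s / total_s
    return (ratio - 1.0 / ncpus) / (1.0 - 1.0 / ncpus)


total_s = 2.8   # single-threaded dump + read time from the message
ncpus = 16

print(f"perfect scaling: {amdahl_time(total_s, 0.0, ncpus):.3f}s")
for target_s in (0.3, 0.5):
    s = implied_serial_fraction(total_s, target_s, ncpus)
    print(f"{target_s:.1f}s would imply a serial fraction of {s:.1%}")
```

Under this model, perfect scaling would give 0.175 seconds, and the 0.3-0.5 second range corresponds to roughly 5% to 12% of the work staying serial.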
Ingo