Message-ID: <20090810083602.GA7176@balbir.in.ibm.com>
Date: Mon, 10 Aug 2009 14:06:02 +0530
From: Balbir Singh <balbir@...ux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, andi.kleen@...el.com,
Prarit Bhargava <prarit@...hat.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
"lizf@...fujitsu.com" <lizf@...fujitsu.com>,
"menage@...gle.com" <menage@...gle.com>,
Pavel Emelianov <xemul@...nvz.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: Help Resource Counters Scale Better (v3)
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> [2009-08-10 15:22:05]:
> On Mon, 10 Aug 2009 14:45:59 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
>
> > > Do you agree?
> >
> > Ok. Config is enough at this stage.
> >
> > The last advice for merge is, it's better to show the numbers or
> > ask someone who has many CPUs to measure the benefits. Then Andrew can
> > see how beneficial this is.
> > (My box has 8 CPUs, but maybe your IBM colleague has a bigger one.)
> >
> > In my experience (in my own old trial),
> > - lock contention itself is low, not high.
> > - but cacheline misses and ping-pong are very, very frequent.
> >
> > So this patch has some benefit logically but, in general,
> > file I/O, swapin/swapout, page allocation/initialization, etc. dominate
> > the performance of usual apps. You'll have to be careful in selecting apps
> > to measure the benefits of this patch by application performance.
> > (And this is why I don't feel as much urgency as you do.)
> >
>
> The reason I say "I want to see the numbers" again and again is that
> this is a performance improvement with a _bad side effect_.
> If this is an urgent problem that needs a fast track, which requires us
> to "fix small problems later", please say so.
>
OK... I finally got a bigger machine (24 CPUs). I ran a simple
program called parallel_pagefault, which does page faults in parallel
(runs on every other CPU): it allocates 10K pages, touches the
allocated data, unmaps it, and repeats the process. I ran the program
for 300 seconds. With the patch, I was able to fault in roughly 1.75
times as many pages as without it (about 49.9M versus 28.5M page
faults). I used the perf tool from tools/perf in the kernel tree.
With patch

 Performance counter stats for '/home/balbir/parallel_pagefault':

   7188177.405648  task-clock-msecs  #   23.926 CPUs
           423130  context-switches  #    0.000 M/sec
              210  CPU-migrations    #    0.000 M/sec
         49851597  page-faults       #    0.007 M/sec
    5900210219604  cycles            #  820.821 M/sec
     424658049425  instructions      #    0.072 IPC
       7867744369  cache-references  #    1.095 M/sec
       2882370051  cache-misses      #    0.401 M/sec

    300.431591843  seconds time elapsed

Without patch

 Performance counter stats for '/home/balbir/parallel_pagefault':

   7192804.124144  task-clock-msecs  #   23.937 CPUs
           424691  context-switches  #    0.000 M/sec
              267  CPU-migrations    #    0.000 M/sec
         28498113  page-faults       #    0.004 M/sec
    5826093739340  cycles            #  809.989 M/sec
     408883496292  instructions      #    0.070 IPC
       7057079452  cache-references  #    0.981 M/sec
       3036086243  cache-misses      #    0.422 M/sec

    300.485365680  seconds time elapsed
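
To make the comparison between the two runs explicit, the ratios can be
derived directly from the counters above. The sketch below hard-codes
the reported values; the struct and function names are made up for
illustration. It works out to about 1.75x as many page faults with the
patch, and a cache-miss ratio (misses / references) of about 36.6% with
the patch versus about 43.0% without.

```c
#include <stdio.h>

/* Counters copied from the two perf stat runs above. */
struct perf_sample {
    double page_faults;
    double cache_refs;
    double cache_misses;
    double elapsed_sec;
};

const struct perf_sample with_patch = {
    49851597.0, 7867744369.0, 2882370051.0, 300.431591843
};
const struct perf_sample without_patch = {
    28498113.0, 7057079452.0, 3036086243.0, 300.485365680
};

/* How many times more page faults run 'a' completed than run 'b'. */
double fault_speedup(const struct perf_sample *a,
                     const struct perf_sample *b)
{
    return a->page_faults / b->page_faults;
}

/* Fraction of cache references that missed. */
double miss_ratio(const struct perf_sample *s)
{
    return s->cache_misses / s->cache_refs;
}

/* Page faults completed per second of wall-clock time. */
double faults_per_sec(const struct perf_sample *s)
{
    return s->page_faults / s->elapsed_sec;
}

/* Print the derived figures for both runs. */
void print_summary(void)
{
    printf("page-fault speedup: %.2fx\n",
           fault_speedup(&with_patch, &without_patch));
    printf("miss ratio:  with %.1f%%  without %.1f%%\n",
           100.0 * miss_ratio(&with_patch),
           100.0 * miss_ratio(&without_patch));
    printf("faults/sec:  with %.0f  without %.0f\n",
           faults_per_sec(&with_patch),
           faults_per_sec(&without_patch));
}
```

Calling print_summary() from main() prints the three derived figures.
Note that elapsed time and CPU utilization are nearly identical in both
runs, so the page-fault count is a fair proxy for throughput here.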
--
Balbir