Message-Id: <20091210172040.37d259d3.kamezawa.hiroyu@jp.fujitsu.com>
Date: Thu, 10 Dec 2009 17:20:40 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
cl@...ux-foundation.org,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
minchan.kim@...il.com
Subject: Re: [RFC mm][PATCH 2/5] percpu cached mm counter
On Thu, 10 Dec 2009 08:54:54 +0100
Ingo Molnar <mingo@...e.hu> wrote:
>
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
>
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
> >
> > Now, mm's counter information is updated by atomic_long_xxx()
> > functions if USE_SPLIT_PTLOCKS is defined. This causes cache misses
> > when page faults happen simultaneously on multiple CPUs. (Almost all
> > process-shared objects do...)
> >
> > When considering finer-grained per-mm page usage accounting, one of
> > the problems is the cost of this counter.
>
> I'd really like these kinds of stats available via the tool you used to
> develop this patchset:
>
> > After:
> > Performance counter stats for './multi-fault 2' (5 runs):
> >
> > 46997471 page-faults ( +- 0.720% )
> > 1004100076 cache-references ( +- 0.734% )
> > 180959964 cache-misses ( +- 0.374% )
> > 29263437363580464 bus-cycles ( +- 0.002% )
> >
> > 60.003315683 seconds time elapsed ( +- 0.004% )
> >
> > cache misses per page fault are reduced from 4.55 to 3.85
>
> I.e. why not expose these stats via perf events and counts as well,
> beyond the current (rather minimal) set of MM stats perf supports
> currently?
>
> That way we'd get a _lot_ of interesting per task mm stats available via
> perf stat (and maybe they can be profiled as well via perf record), and
> we could perhaps avoid uglies like having to hack hooks into sched.c:
>
As I wrote in 0/5, this is ultimately for the oom-killer, i.e. for kernel-internal
use, not for users' perf events.
- http://marc.info/?l=linux-mm&m=125714672531121&w=2
And Christoph has concerns about cache misses on this counter.
- http://archives.free.net.ph/message/20091104.191441.1098b93c.ja.html
This patch replaces atomic_long_add() with a per-cpu counter.
> > + /*
> > + * sync/invaldidate per-cpu cached mm related information
> > + * before taling rq->lock. (see include/linux/mm.h)
>
> (minor typo: s/taling/taking )
>
Oh, thanks.
> > + */
> > + sync_mm_counters_atomic();
> >
> > spin_lock_irq(&rq->lock);
> > update_rq_clock(rq);
>
> It's not a simple task i guess since this per mm counting business has
> grown its own variant which takes time to rearchitect, plus i'm sure
> there's performance issues to solve if such a model is exposed via perf,
> but users and developers would be _very_ well served by such
> capabilities:
>
> - clean, syscall based API available to monitor tasks, workloads and
> CPUs. (or the whole system)
>
> - sampling (profiling)
>
> - tracing, post-process scripting via Perl plugins
>
I'm sorry if I miss your point... are you saying we should remove all
mm_counters completely and rebuild them under perf? If so, wouldn't some proc
files (/proc/<pid>/statm etc.) be broken?
Thanks,
-Kame