Message-Id: <20091210172040.37d259d3.kamezawa.hiroyu@jp.fujitsu.com>
Date: Thu, 10 Dec 2009 17:20:40 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
cl@...ux-foundation.org,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
minchan.kim@...il.com
Subject: Re: [RFC mm][PATCH 2/5] percpu cached mm counter
On Thu, 10 Dec 2009 08:54:54 +0100
Ingo Molnar <mingo@...e.hu> wrote:
>
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
>
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
> >
> > Now, mm's counter information is updated by atomic_long_xxx()
> > functions if USE_SPLIT_PTLOCKS is defined. This causes cache misses
> > when page faults happen simultaneously on multiple CPUs. (Almost all
> > process-shared objects do...)
> >
> > When considering finer-grained per-mm page usage accounting, one of
> > the problems is the cost of this counter.
>
> I'd really like these kinds of stats available via the tool you used to
> develop this patchset:
>
> > After:
> > Performance counter stats for './multi-fault 2' (5 runs):
> >
> > 46997471 page-faults ( +- 0.720% )
> > 1004100076 cache-references ( +- 0.734% )
> > 180959964 cache-misses ( +- 0.374% )
> > 29263437363580464 bus-cycles ( +- 0.002% )
> >
> > 60.003315683 seconds time elapsed ( +- 0.004% )
> >
> > cache misses per page fault are reduced from 4.55 to 3.85
>
> I.e. why not expose these stats via perf events and counts as well,
> beyond the current (rather minimal) set of MM stats perf supports
> currently?
>
> That way we'd get a _lot_ of interesting per task mm stats available via
> perf stat (and maybe they can be profiled as well via perf record), and
> we could perhaps avoid uglies like having to hack hooks into sched.c:
>
As I wrote in 0/5, this is ultimately for the oom-killer, i.e. for kernel-internal
use, not for users' perf events.
- http://marc.info/?l=linux-mm&m=125714672531121&w=2
And Christoph has concerns about cache misses on this counter.
- http://archives.free.net.ph/message/20091104.191441.1098b93c.ja.html
This patch replaces atomic_long_add() with a per-cpu counter.
> > + /*
> > + * sync/invaldidate per-cpu cached mm related information
> > + * before taling rq->lock. (see include/linux/mm.h)
>
> (minor typo: s/taling/taking )
>
Oh, thanks.
> > + */
> > + sync_mm_counters_atomic();
> >
> > spin_lock_irq(&rq->lock);
> > update_rq_clock(rq);
>
> It's not a simple task i guess since this per mm counting business has
> grown its own variant which takes time to rearchitect, plus i'm sure
> there's performance issues to solve if such a model is exposed via perf,
> but users and developers would be _very_ well served by such
> capabilities:
>
> - clean, syscall based API available to monitor tasks, workloads and
> CPUs. (or the whole system)
>
> - sampling (profiling)
>
> - tracing, post-process scripting via Perl plugins
>
I'm sorry if I miss your point... are you saying we should remove all
mm_counters completely and rebuild them under perf? If so, wouldn't some proc
files (/proc/<pid>/statm etc.) be broken?
Thanks,
-Kame