linux-kernel - Re: [Patch] mm tracepoints update


Message-ID: <4A38F885.8040009@redhat.com>
Date:	Wed, 17 Jun 2009 10:07:01 -0400
From:	Larry Woodman <lwoodman@...hat.com>
To:	Rik van Riel <riel@...hat.com>
CC:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Ingo Molnar <mingo@...e.hu>,
	Frédéric Weisbecker <fweisbec@...il.com>,
	Li Zefan <lizf@...fujitsu.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	eduard.munteanu@...ux360.ro, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, rostedt@...dmis.org, lwoodman@...hat.com,
	Linda Wang <lwang@...hat.com>
Subject: Re: [Patch] mm tracepoints update - use case.

Rik van Riel wrote:
>
> Sorry I am replying to a really old email, but exactly
> what information do you believe would be more useful to
> extract from vmscan.c with tracepoints?
>
> What are the kinds of problems that customer systems
> (which cannot be rebooted into experimental kernels)
> run into, that can be tracked down with tracepoints?
>
> I can think of a few:
> - excessive CPU use in page reclaim code
> - excessive reclaim latency in page reclaim code
> - unbalanced memory allocation between zones/nodes
> - strange balance problems between reclaiming of page
>   cache and swapping out process pages
>
> I suspect we would need fairly fine grained tracepoints
> to track down these kinds of problems, with filtering
> and/or interpretation in userspace, but I am always
> interested in easier ways of tracking down these kinds
> of problems :)
>
> What kinds of tracepoints do you believe we would need?
>
> Or, using Larry's patch as a starting point, what do you
> believe should be changed?
>

Rik, I know these mm tracepoint patches produce a lot of output in the
trace buffer.  In a nutshell, what I have done is add them at critical
locations in the code that allocates memory, maps that memory into user
space, unmaps it from user space, and frees it.  In addition, I have
added tracepoints to important places in the memory allocation and
reclaim paths so we can see failures, stalls and high latencies as well
as normal behavior.  Finally, I added them to the pdflush operations so
we can determine how much memory is written back to disk there versus
in the swapout paths.  Perhaps, if this is too many tracepoints all at
once, we could focus mainly on those specific to the page reclaim code
path, since that is where most contention occurs?

Anonymous memory tracepoints:
1.) mm_anon_fault - initial anonymous pagefault.
2.) mm_anon_unmap - anonymous unmap triggered by page reclaim.
3.) mm_anon_userfree - anonymous memory unmap by user.
4.) mm_anon_cow - anonymous COW fault
5.) mm_anon_pgin - anonymous pagein from swap.

Filemap memory tracepoints:
1.) mm_filemap_fault - initial filemap fault.
2.) mm_filemap_cow - filemap COW fault.
3.) mm_filemap_userunmap - filemap unmap by user.
4.) mm_filemap_unmap - filemap unmap triggered by page reclaim.

Page allocation failure tracepoints:
1.) mm_page_allocation - page allocation that fails and causes page reclaim.

Page kswapd and direct reclaim tracepoints:
1.) mm_kswapd_ran - kswapd ran and tells us how many pages it reclaimed.
2.) mm_directreclaim_reclaimall - direct reclaim because free lists were below min.
3.) mm_directreclaim_reclaimzone - direct reclaim of a specific numa node.

Inner workings of the page reclaim tracepoints:
1.) mm_pagereclaim_shrinkzone - shrink zone, tells us how many pages were scanned.
2.) mm_pagereclaim_shrinkinactive - shrink inactive list, tells us how many pages were deactivated.
3.) mm_pagereclaim_shrinkactive - shrink active list, tells us how many pages were processed.
4.) mm_pagereclaim_pgout - pageout, tells us which pages were paged out.
5.) mm_pagereclaim_free - tells us how many pages were freed in each page reclaim invocation.

Pagecache flushing tracepoints:
1.) mm_balance_dirty - tells us how many pages were written when dirty was above dirty_ratio.
2.) mm_pdflush_bgwriteout - tells us how many pages were written when dirty was above dirty_background_ratio.
3.) mm_pdflush_kupdate - tells us how many pages kupdate wrote.
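
Since these tracepoints can fill the trace buffer quickly, a first pass
in userspace could simply tally events by name and, if wanted, keep only
the reclaim-related ones. The following is a rough sketch, not part of
the patch. It assumes the usual ftrace text layout of
"task-pid [cpu] timestamp: event: details"; the function name
count_events and the RECLAIM_PREFIXES tuple are made up for illustration,
and the event names are the ones listed above.

```python
import re
from collections import Counter

# Assumption: ftrace-style text lines such as
#   "   kswapd0-45  [001]  100.000002: mm_kswapd_ran: ..."
# An optional flags column between the CPU and the timestamp is tolerated.
LINE_RE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?:\S+\s+)?(?P<ts>\d+\.\d+):\s+(?P<event>\w+):"
)

# Hypothetical grouping: prefixes of the reclaim-path tracepoints above.
RECLAIM_PREFIXES = ("mm_kswapd_", "mm_directreclaim_", "mm_pagereclaim_")


def count_events(lines, prefixes=None):
    """Count trace events by name, optionally keeping only given prefixes."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # header, comment, or a line in some other format
        event = match.group("event")
        if prefixes and not event.startswith(prefixes):
            continue
        counts[event] += 1
    return counts


# Possible usage, with the trace text piped in on stdin:
#   import sys
#   for name, n in count_events(sys.stdin, RECLAIM_PREFIXES).most_common():
#       print(n, name)
```

Counting by name only shows which paths are busiest; pulling out the
per-event fields (pages scanned, pages freed) would need the exact output
format of each tracepoint, which is not assumed here.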

View attachment "mmtracepoints-617.diff" of type "text/plain" (17240 bytes)